Compare Two Excel Files in Java

Comparing two or multiple excel workbooks is very common requirement specially for automated test scenarios. In this section, we will learn how to compare two excel workbooks or to verify if two workbooks have same data set.

Comparing Excel Files

We must not start comparing excel sheets at cell level or cell data directly. We should first check for basic conditions to be true then proceed for cell comparison.

Basic ordered conditions to be checked first are as follows:

  • Do workbooks have same number of sheets?
  • We can change sheet order as needed. I am changing sheet order while keeping the same order for cell data.
  • Do workbooks have same numbers of rows?
  • Do workbooks have same numbers of columns/cells of corresponding rows? Corresponding rows mean each row may have different number of columns.

When above conditions are checked are passed then we should go for cell data comparison. If either check fails then we should not proceed further.

Check If Both Excel Workbooks Have Same Number of Sheets

To get number of sheets in a workbook, we need to use getNumberOfSheets() of Workbook interface provided by Apache POI. Workbook interface is implemented by major classes like XSSFWorkbook , HSSFWorkbook.

Check If Both Excel Workbooks Have Sheets of Same Name

Here we have two scenarios.

First - Just verify if sheet names are same irrespective of sheet orders.

Second - Verify sheet names are same and appears in same order.

Check If Both Workbooks Have Same Number of Rows In All Sheets

We need to check if number of rows in each sheet of workbooks are same. If sheet1 of workbook 1 has 10 rows and sheet2 has 15 rows then workbook 2 should also have 10 rows in sheet1 and 15 rows in sheet2.

To get number of rows in a sheet we need to use getPhysicalNumberOfRows() method of Sheet interface. This method returns the number of physically defined rows (NOT the number of rows in the sheet). If we have 10 rows in a sheet, each row will be indexed or numbered staring from zero. Same we need to do for all sheets. Below is an example code for one sheet.

Check If Both Workbooks Have Same Column for Row

A row can have different number of columns. So we need to check for each row in all sheets. To iterate through rows, we can use rowIterator() method of Sheet interface that returns an iterator of the physical rows. During iteration of rows, we need to get physical number of cells. Same logic need to do for all sheets.

Now we need to check for cell data and assert.

Each cell has a type like String, Numeric, Boolean etc. To get value as per type, different method needs to be used. For example, to get data from a Numeric cell type, we need to use getNumericCellValue() and getStringCellValue() for String cell type.

If a cell consists Date, we do not have any such type for Cell. A Cell with date falls under Numeric type. We need to check if cell is date formatted. We can check using DateUtil.isCellDateFormatted(Cell cell).

CompareExcelFiles.java

Output:

The cell values at (0, 0) are the same
The cell values at (0, 1) are the same
The cell values at (0, 2) are the same
The cell values at (0, 3) are the same
The cell values at (1, 0) are the same
The cell values at (1, 1) are the same
The cell values at (1, 2) are the same
The cell values at (1, 3) are the same
The cell values at (2, 0) are the same
The cell values at (2, 1) are the same
The cell values at (2, 2) are the same
The cell values at (2, 3) are the same
The cell values at (3, 0) are the same
The cell values at (3, 1) are the same
The cell values at (3, 2) are the same
The cell values at (3, 3) are the same

Note: In the above program, assumed that file1.xls and file2.xls contain two sheets with the same structure and values.






Latest Courses