Java – Validate CSV using SuperCSV with custom CsvColumnProcessor

Validating CSV files is straightforward with SuperCSV: its cell processors let you declare the rules for each column instead of hand-writing the parsing and the checks.

Assume that management requires us to validate new employee data coming from different parts of the United States against the following constraints.

New Employee Validation Rules

| # | Field | Description | Validation rule |
|---|-------|-------------|-----------------|
| 1 | EMPLOYEE ID | Employee ID provided by HQ for a particular office branch. | Must be unique within the CSV file. Maximum length is 10 characters. Format: ########## |
| 2 | LAST NAME | Last name of the new employee | Letters only; minimum length 1, maximum length 50. |
| 3 | FIRST NAME | First name of the new employee | Letters only; minimum length 1, maximum length 50. |
| 4 | SSN | SSN | Format: |
| 5 | HOME STATE | Current home state of the employee. Most likely the branch office's state. | 2-letter state name |
| 6 | COUNTRY | US only. If empty, defaults to US. | US only. If empty, defaults to US. |
| 7 | HIRE DATE | Employee's hire date | Format: MM/DD/YYYY |
| 8 | COMMENT | Any comments about the new employee from the hiring manager | Maximum 100 characters |
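Before bringing in any library, it can help to see what these rules amount to in plain Java. The following is only a sketch under stated assumptions: the class name `NewEmployeeRules` and its method names are hypothetical, "letters only" is taken to mean any Unicode letter, and the home state check only verifies the shape (two uppercase letters), not that the state exists. The employee ID rule needs state to detect duplicates and the SSN format is not given above, so both are left out.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.regex.Pattern;

/** Hypothetical helper: one static check per stateless column rule. */
class NewEmployeeRules {
    // Assumption: "letters only" means 1 to 50 Unicode letters.
    private static final Pattern NAME = Pattern.compile("\\p{L}{1,50}");
    // Assumption: two uppercase ASCII letters; not checked against a list of real states.
    private static final Pattern STATE = Pattern.compile("[A-Z]{2}");
    // MM/DD/YYYY with strict resolution, so impossible dates such as 02/30/2020 are rejected.
    private static final DateTimeFormatter HIRE_DATE =
            DateTimeFormatter.ofPattern("MM/dd/uuuu").withResolverStyle(ResolverStyle.STRICT);

    static boolean isValidName(String value) {
        return value != null && NAME.matcher(value).matches();
    }

    static boolean isValidHomeState(String value) {
        return value != null && STATE.matcher(value).matches();
    }

    /** Returns "US" for an empty cell and rejects any value other than "US". */
    static String normalizeCountry(String value) {
        if (value == null || value.trim().isEmpty()) {
            return "US";
        }
        if (!"US".equals(value.trim())) {
            throw new IllegalArgumentException("Country must be US: " + value);
        }
        return "US";
    }

    static boolean isValidHireDate(String value) {
        if (value == null) {
            return false;
        }
        try {
            LocalDate.parse(value, HIRE_DATE);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    /** An empty or missing comment is accepted; anything over 100 characters is not. */
    static boolean isValidComment(String value) {
        return value == null || value.length() <= 100;
    }
}
```

Each check is deliberately small so that it can later be wrapped in whatever validation mechanism the CSV library offers.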



First, we need to add SuperCSV to the project as a Maven dependency.
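A dependency entry along the following lines should work. The coordinates `net.sf.supercsv:super-csv` are the ones SuperCSV is published under on Maven Central; the version shown is an assumption, so check for the release that suits your project.

```xml
<dependency>
    <groupId>net.sf.supercsv</groupId>
    <artifactId>super-csv</artifactId>
    <!-- Assumed version; replace with the release you actually want -->
    <version>2.4.0</version>
</dependency>
```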

Cell Processors

SuperCSV provides cell processors that are used when reading and writing CSV files. They automate type conversions and can enforce constraints on each cell.

There are 4 types of cell processors – Reading, Writing, Reading/Writing, and Constraints. Please see http://super-csv.github.io/super-csv/cell_processors.html for more information.

In this post, we’ll write custom constraint cell processors.

Interfaces and Classes

Keeping in mind that other CSV-related tasks may be handed to you in the future, we’ll try to write clean code so that things are easier to extend.

So, we start with an interface.

Then, we implement the interface.

You’ll notice there are 8 “rules”, one for each CSV column (each cell, actually), to validate against.
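As a rough, library-independent sketch of the "one rule per column" idea: the names `ColumnRule` and `RowValidator` below are hypothetical and not taken from the original source code. SuperCSV follows the same shape, with an array of cell processors whose positions match the CSV columns.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical contract: returns null when the cell is valid, or an error message. */
interface ColumnRule {
    String check(String cell);
}

/** Applies one rule per column, in column order, to a single CSV row. */
class RowValidator {
    private final ColumnRule[] rules;

    RowValidator(ColumnRule... rules) {
        this.rules = rules;
    }

    /** Returns every error found in the row; an empty list means the row is valid. */
    List<String> validate(String[] row) {
        List<String> errors = new ArrayList<>();
        if (row.length != rules.length) {
            errors.add("Expected " + rules.length + " columns but found " + row.length);
            return errors;
        }
        for (int i = 0; i < rules.length; i++) {
            String error = rules[i].check(row[i]);
            if (error != null) {
                // Report 1-based column numbers, matching the rules table.
                errors.add("Column " + (i + 1) + ": " + error);
            }
        }
        return errors;
    }
}
```

Because `ColumnRule` has a single method, each rule can be supplied as a lambda, for example `c -> c.length() <= 100 ? null : "max 100 characters"`, and adding a ninth column later only means passing one more rule to the constructor.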

Let’s look at one of them. The other files are available at the GitHub link posted below.


The employee id “rule” looks like this.

We extend CellProcessorAdaptor and override the execute method. Notice that we use a Set object to track the unique Employee IDs seen so far in the CSV file. Once a duplicate is detected, we throw an exception.

Also notice that we use a regular expression to ensure Employee IDs are exactly 10 digits long.
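The core of that check can be sketched without any SuperCSV types. This is an illustration, not the original class: `EmployeeIdRule` is a hypothetical name, and `IllegalArgumentException` stands in for whatever exception the real cell processor throws. In a SuperCSV processor, the same two steps would sit inside the overridden execute method.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical, library-independent sketch of the employee ID rule. */
class EmployeeIdRule {
    // Format ##########: exactly ten digits.
    private static final Pattern TEN_DIGITS = Pattern.compile("\\d{10}");

    // IDs already seen in the current file; use a fresh instance per CSV file.
    private final Set<String> seenIds = new HashSet<>();

    /** Returns the ID unchanged when it is valid; throws otherwise. */
    String validate(String employeeId) {
        if (employeeId == null || !TEN_DIGITS.matcher(employeeId).matches()) {
            throw new IllegalArgumentException(
                    "Employee ID must be exactly 10 digits: " + employeeId);
        }
        // Set.add returns false when the value is already present.
        if (!seenIds.add(employeeId)) {
            throw new IllegalArgumentException("Duplicate Employee ID: " + employeeId);
        }
        return employeeId;
    }
}
```

Since the set lives in the rule instance, reusing one instance across several files would flag IDs that repeat between files, which goes beyond the "unique within the CSV file" requirement.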


We create another class to read the CSV file and at the same time validate its contents.
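To illustrate reading and validating in a single pass, here is a deliberately naive plain-Java sketch. It is not the SuperCSV-based class described above: `NewEmployeeCsvChecker` is a hypothetical name, the first line is assumed to be a header, only the employee ID column is checked, and rows are split on commas with no support for quoted fields, which is exactly the kind of parsing a CSV library takes care of.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: read rows and validate them in one pass. */
class NewEmployeeCsvChecker {

    /** Returns one message per problem, prefixed with the 1-based line number. */
    static List<String> check(Reader source) throws IOException {
        List<String> errors = new ArrayList<>();
        Set<String> seenIds = new HashSet<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line = reader.readLine(); // assumption: the first line is a header
            int lineNumber = 1;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                // Naive split: keeps trailing empty cells, ignores quoting rules.
                String[] cells = line.split(",", -1);
                if (cells.length != 8) {
                    errors.add("Line " + lineNumber + ": expected 8 columns, found "
                            + cells.length);
                    continue;
                }
                String id = cells[0];
                if (!id.matches("\\d{10}")) {
                    errors.add("Line " + lineNumber + ": Employee ID must be 10 digits");
                } else if (!seenIds.add(id)) {
                    errors.add("Line " + lineNumber + ": duplicate Employee ID " + id);
                }
            }
        }
        return errors;
    }
}
```

Collecting messages instead of stopping at the first failure lets the sender fix the whole file in one round trip; whether to stop early or continue is a design choice that depends on how the errors are reported back.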


If you want to look at the other files, please download the source code files from https://github.com/Turreta/turreta-supercsv-validation-example