What is UniVocity-parsers?
uniVocity-parsers is a high-performance, open-source library that helps Java developers parse and write data in CSV, TSV, fixed-width, and other formats. It is fast, flexible, and developer-friendly.
Why Choose uniVocity-parsers?
Many libraries can parse data, but uniVocity-parsers stands out for a few reasons:
- It’s fast.
- It’s flexible.
- It supports annotations.
- It handles edge cases well.
- It has strong documentation.
Version 4.11.0 adds performance improvements and bug fixes. It’s more stable than previous versions.
Setting Up uniVocity-parsers
You can include it using Maven:
<!-- https://mvnrepository.com/artifact/org.apache.camel/camel-univocity-parsers -->
<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-univocity-parsers</artifactId>
    <version>4.11.0</version>
</dependency>
Parsing CSV Files
Parsing CSV is easy. First, create a CsvParserSettings object.
CsvParserSettings settings = new CsvParserSettings();
settings.setHeaderExtractionEnabled(true);
Then, create the parser:
CsvParser parser = new CsvParser(settings);
List<String[]> rows = parser.parseAll(new FileReader("data.csv"));
This code reads all lines from a file named data.csv and returns a list of string arrays.
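If you need individual column values, you can loop over the returned rows. A minimal usage sketch, assuming the first column of data.csv holds a name:
for (String[] row : rows) {
    // Each array holds the column values of one record
    System.out.println("Name: " + row[0] + " (" + row.length + " columns)");
}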
Prepare the data
// Create sample data
List<Employee> employees = new ArrayList<>();
employees.add(new Employee(1, "John Smith", "[email protected]", 32, "Engineering", 85000.00));
employees.add(new Employee(2, "Jane Doe", "[email protected]", 28, "Marketing", 72000.00));
employees.add(new Employee(3, "Bob Johnson", "[email protected]", 45, "Finance", 95000.00));
employees.add(new Employee(4, "Alice Brown", "[email protected]", 37, "HR", 68000.00));
employees.add(new Employee(5, "Charlie Black", "[email protected]", 41, "Sales", 78000.00));
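The Employee class itself is not shown here; a minimal sketch of what it might look like, with @Parsed annotations so it can also be used with the bean processors covered later (the field names are assumptions based on the constructor arguments):
public class Employee {
    @Parsed private int id;
    @Parsed private String name;
    @Parsed private String email;
    @Parsed private int age;
    @Parsed private String department;
    @Parsed private double salary;

    public Employee() {
        // no-arg constructor required when parsing rows into beans
    }

    public Employee(int id, String name, String email, int age, String department, double salary) {
        this.id = id;
        this.name = name;
        this.email = email;
        this.age = age;
        this.department = department;
        this.salary = salary;
    }
}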
Advanced settings
CsvParserSettings settings = new CsvParserSettings();
settings.setDelimiterDetectionEnabled(true, ','); // Detect the delimiter automatically, trying ',' first
settings.setQuoteDetectionEnabled(true); // Detect the quote character automatically
settings.setUnescapedQuoteHandling(UnescapedQuoteHandling.STOP_AT_CLOSING_QUOTE); // Stop reading a quoted value at the closing quote
settings.setKeepQuotes(true); // Keep quote characters in parsed values
settings.setNullValue("N/A"); // Text to use for null values
settings.setEmptyValue(""); // Text to use for empty values
settings.setSkipEmptyLines(true); // Skip empty lines when parsing
settings.setHeaders("Name", "Age", "Country"); // Define headers
settings.getFormat().setComment('#'); // Set the comment character
Writing CSV Files
Writing CSV is just as simple. Use CsvWriterSettings:
package com.example.jobrunr.univocity;

import com.univocity.parsers.csv.CsvWriter;
import com.univocity.parsers.csv.CsvWriterSettings;

import java.io.FileWriter;
import java.io.IOException;

public class WriteCSV {
    public static void main(String[] args) throws IOException {
        CsvWriterSettings writerSettings = new CsvWriterSettings();
        CsvWriter writer = new CsvWriter(new FileWriter("D:\\logs\\csv\\univocity\\output.csv"), writerSettings);
        writer.writeHeaders("Name", "Age", "Country");
        writer.writeRow("Alice", "30", "USA");
        writer.writeRow("Bob", "25", "Canada");
        writer.close();
    }
}
You now have a clean output.csv file:
Name,Age,Country
Alice,30,USA
Bob,25,Canada
Writing CSV Using BeanWriterProcessor with a Java bean
private static void write() throws IOException {
    List<Person> people = Arrays.asList(
            new Person("Alice", 30, "USA"),
            new Person("Bob", 25, "Canada"),
            new Person("Carlos", 28, "Brazil")
    );
    Writer fileWriter = new FileWriter("people.csv");

    CsvWriterSettings settings = new CsvWriterSettings();
    settings.setHeaderWritingEnabled(true);

    BeanWriterProcessor<Person> processor = new BeanWriterProcessor<>(Person.class);
    settings.setRowWriterProcessor(processor);

    CsvWriter writer = new CsvWriter(fileWriter, settings);
    writer.writeHeaders(); // Writes: name, age, country
    for (Person person : people) {
        writer.processRecord(person); // Process each record manually
    }
    writer.close();
}
Using Annotations for Bean Parsing
uniVocity-parsers supports Java beans. Annotate fields with @Parsed.
@Getter
@Setter
@ToString
@NoArgsConstructor  // required so the parser can instantiate the bean
@AllArgsConstructor // provides the (name, age, country) constructor used in the writing example above
public class Person {

    @Parsed
    private String name;

    @Parsed
    private int age;

    @Parsed
    private String country;
}
Now use a BeanListProcessor:
// Configure CSV parser settings
CsvParserSettings settings = new CsvParserSettings();
settings.setDelimiterDetectionEnabled(true, ','); // Detect the delimiter automatically, trying ',' first
settings.setQuoteDetectionEnabled(true); // Detect the quote character automatically
settings.setUnescapedQuoteHandling(UnescapedQuoteHandling.STOP_AT_CLOSING_QUOTE);
settings.setKeepQuotes(true);
settings.setNullValue("N/A"); // Text to use for null values
settings.setEmptyValue(""); // Text to use for empty values
settings.setSkipEmptyLines(true); // Skip empty lines when parsing
settings.setHeaders("Name", "Age", "Country"); // Define headers
settings.getFormat().setComment('#'); // Set the comment character

BeanListProcessor<Person> processor = new BeanListProcessor<>(Person.class);
settings.setProcessor(processor);

CsvParser parser = new CsvParser(settings);
parser.parse(new FileReader("output.csv"));

List<Person> people = processor.getBeans();
This approach keeps your code clean and object-oriented.
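For a quick check you can print the mapped beans; the @ToString annotation on Person makes the output readable:
people.forEach(System.out::println); // prints Person(name=..., age=..., country=...)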
Handling Fixed-Width Files
Parsing fixed-width files is easy too. Define the column lengths:
FixedWidthFields lengths = new FixedWidthFields(10, 5, 20);
FixedWidthParserSettings settings = new FixedWidthParserSettings(lengths);
FixedWidthParser parser = new FixedWidthParser(settings);
List<String[]> rows = parser.parseAll(new FileReader("fixedwidth.txt"));You can also define names:
lengths.addField("Name", 10);
lengths.addField("Age", 5);
lengths.addField("Email", 20);This helps make your data more readable.
Skipping Headers and Comments
The parser can skip headers and comments automatically.
settings.setHeaderExtractionEnabled(true);
settings.setLineSeparatorDetectionEnabled(true);
settings.getFormat().setComment('#');
With this, lines starting with # will be ignored.
Validating Input
You can validate data on the fly:
settings.setProcessorErrorHandler((error, inputRow, context) -> {
    // The third argument is the parsing context, not a row index
    System.err.println("Error in record " + context.currentRecord() + ": " + error.getMessage());
});
This feature improves data reliability.
Trimming and Null Handling
You can remove whitespaces easily:
settings.setIgnoreLeadingWhitespaces(true);
settings.setIgnoreTrailingWhitespaces(true);
And handle empty strings:
settings.setNullValue("");This avoids common issues with blank data.
Parsing Large Files
uniVocity-parsers can process rows as they are read, without buffering the whole file. Use row processors to avoid memory issues:
settings.setRowProcessor(new AbstractRowProcessor() {
    @Override
    public void rowProcessed(String[] row, ParsingContext context) {
        System.out.println(Arrays.toString(row));
    }
});
parser.parse(new FileReader("bigfile.csv"));
This handles each row without loading everything in memory.
What’s New in Version 4.11.0?
Version 4.11.0 introduces minor updates:
- Faster column processing.
- Bug fixes for edge case delimiters.
- Enhanced annotation support.
- Better memory performance for large datasets.
You can view the changelog on the official GitHub page.
Best Practices
Follow these tips for better results (a combined sketch follows the list):
- Always validate your input files.
- Use beans for structured data.
- Trim and clean data during parsing.
- Use row processors for large files.
- Benchmark different settings.
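A minimal sketch combining several of these tips: trimming whitespace, mapping rows to the Person bean shown earlier, and reporting bad records through an error handler. The input file name is an assumption:
CsvParserSettings settings = new CsvParserSettings();
settings.setHeaderExtractionEnabled(true);
settings.setIgnoreLeadingWhitespaces(true);  // trim and clean data during parsing
settings.setIgnoreTrailingWhitespaces(true);

// Map each row to a Person bean instead of a raw String[]
BeanListProcessor<Person> processor = new BeanListProcessor<>(Person.class);
settings.setProcessor(processor);

// Validate input: report processing errors instead of failing silently
settings.setProcessorErrorHandler((error, inputRow, context) -> {
    System.err.println("Bad record " + context.currentRecord() + ": " + error.getMessage());
});

CsvParser parser = new CsvParser(settings);
parser.parse(new FileReader("people.csv")); // file name is an assumption
List<Person> people = processor.getBeans();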
Real-World Use Cases
You can use uniVocity-parsers in:
- Financial data pipelines
- Log processing systems
- ETL operations
- Machine learning preprocessing
- Inventory systems
It’s robust enough for enterprise applications.
Conclusion
uniVocity-parsers makes Java data parsing effortless. It’s flexible, fast, and easy to use. Whether you deal with CSV, TSV, or fixed-width files, this library has you covered.
Version 4.11.0 builds on a strong foundation. It ensures cleaner, safer, and faster parsing for your applications.
Start using uniVocity-parsers today. You’ll never struggle with data formats again.
This article was originally published on Medium.



