Class CsvParser
Field Summary
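Fields (Modifier and Type, Field)
- private boolean backToDelimiter
- private char delimiter
- private char[] delimiters
- private final boolean doNotEscapeUnquotedValues
- private final String emptyValue
- private char escapeEscape
- private int formatDetectorRowSampleCount
- private final boolean keepEscape
- private final boolean keepQuotes
- private int match
- private final int maxColumnLength
- private char[] multiDelimiter
- private char newLine
- private final boolean normalizeLineEndingsInQuotes
- private final String nullValue
- private boolean parseUnescapedQuotes
- private boolean parseUnescapedQuotesUntilDelimiter
- private char prev
- private char quote
- private char quoteEscape
- private UnescapedQuoteHandling quoteHandling
- private final boolean trimQuotedLeading
- private final boolean trimQuotedTrailing
- private boolean unescaped
- private final DefaultCharAppender whitespaceAppender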
Fields inherited from class com.univocity.parsers.common.AbstractParser
ch, comment, comments, context, ignoreLeadingWhitespace, ignoreTrailingWhitespace, input, lastComment, output, processor, settings, whitespaceRangeStart
Constructor Summary
Constructors (Constructor, Description)
- CsvParser(CsvParserSettings settings): The CsvParser supports all settings provided by CsvParserSettings, and requires this configuration to be properly initialized.
Method Summary
Methods (Modifier and Type, Method, Description)
- private void appendUntilMultiDelimiter()
- protected final boolean consumeValueOnEOF(): Allows the parser implementation to handle any value that was being consumed when the end of the input was reached.
- final CsvFormat getDetectedFormat(): Returns the CSV format detected when one of the following settings is enabled: CommonParserSettings.isLineSeparatorDetectionEnabled(), CsvParserSettings.isDelimiterDetectionEnabled(), CsvParserSettings.isQuoteDetectionEnabled(). The detected format will be available once the parsing process is initialized.
- protected final InputAnalysisProcess getInputAnalysisProcess(): Allows the parser implementation to traverse the input buffer before the parsing process starts, in order to enable automatic configuration and discovery of data formats.
- private boolean handleUnescapedQuote()
- private void handleUnescapedQuoteInValue()
- private void handleValueSkipping(boolean quoted)
- private boolean matchDelimiter()
- private boolean matchDelimiterAfterQuote()
- private int nextDelimiter()
- private void parseMultiDelimiterRecord()
- private void parseQuotedValue()
- private void parseQuotedValueMultiDelimiter()
- protected final void parseRecord(): Parser-specific implementation for reading a single record from the input.
- private final void parseSingleDelimiterRecord()
- private void parseValueProcessingEscape()
- private void parseValueProcessingEscapeMultiDelimiter()
- private void processQuoteEscape()
- private void saveMatchingCharacters()
- private void skipValue()
- private void skipWhitespace()
- final void updateFormat(CsvFormat format): Allows changing the format of the input on the fly.
Methods inherited from class com.univocity.parsers.common.AbstractParser
beginParsing, beginParsing, beginParsing, beginParsing, beginParsing, beginParsing, beginParsing, createParsingContext, getContext, getRecordMetadata, inComment, initialize, iterate, iterate, iterate, iterate, iterate, iterate, iterate, iterateRecords, iterateRecords, iterateRecords, iterateRecords, iterateRecords, iterateRecords, iterateRecords, parse, parse, parse, parse, parse, parse, parse, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseLine, parseNext, parseNextRecord, parseRecord, processComment, reloadHeaders, stopParsing
Field Details
- parseUnescapedQuotes
  private boolean parseUnescapedQuotes
- parseUnescapedQuotesUntilDelimiter
  private boolean parseUnescapedQuotesUntilDelimiter
- backToDelimiter
  private boolean backToDelimiter
- doNotEscapeUnquotedValues
  private final boolean doNotEscapeUnquotedValues
- keepEscape
  private final boolean keepEscape
- keepQuotes
  private final boolean keepQuotes
- unescaped
  private boolean unescaped
- prev
  private char prev
- delimiter
  private char delimiter
- multiDelimiter
  private char[] multiDelimiter
- quote
  private char quote
- quoteEscape
  private char quoteEscape
- escapeEscape
  private char escapeEscape
- newLine
  private char newLine
- whitespaceAppender
  private final DefaultCharAppender whitespaceAppender
- normalizeLineEndingsInQuotes
  private final boolean normalizeLineEndingsInQuotes
- quoteHandling
  private UnescapedQuoteHandling quoteHandling
- nullValue
  private final String nullValue
- maxColumnLength
  private final int maxColumnLength
- emptyValue
  private final String emptyValue
- trimQuotedLeading
  private final boolean trimQuotedLeading
- trimQuotedTrailing
  private final boolean trimQuotedTrailing
- delimiters
  private char[] delimiters
- match
  private int match
- formatDetectorRowSampleCount
  private int formatDetectorRowSampleCount
Constructor Details
- CsvParser
  CsvParser(CsvParserSettings settings)
  The CsvParser supports all settings provided by CsvParserSettings, and requires this configuration to be properly initialized.
  Parameters: settings - the parser configuration
Method Details
- parseRecord
  protected final void parseRecord()
  Description copied from class: AbstractParser
  Parser-specific implementation for reading a single record from the input.
  The AbstractParser handles the initialization and processing of the input until it is ready to be parsed. It then delegates the input to the parser-specific implementation defined by AbstractParser.parseRecord(). In general, an implementation of AbstractParser.parseRecord() will perform the following steps:
  - Test the character stored in ch and take some action on it (e.g. while (ch != '\n') { doSomething(); })
  - Request more characters by calling ch = input.nextChar();
  - Append the desired characters to the output by executing, for example, output.appender.append(ch)
  - Notify that a value of the record has been fully read by executing output.valueParsed(). This will clear the output appender (CharAppender) so the next call to output.appender.append(ch) will store the character of the next parsed value
  - Rinse and repeat until all values of the record are parsed
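The steps listed above can be sketched as a small, self-contained loop. This is only an illustration under stated assumptions: `ToyInput`, `ToyOutput`, the hard-coded comma delimiter, and the `'\0'` end-of-input marker are hypothetical stand-ins invented for this sketch, not univocity classes or behavior. Only the names `ch`, `input.nextChar()`, `output.appender.append(ch)` and `output.valueParsed()` mirror the description.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of the parseRecord() steps; not the univocity implementation. */
final class RecordLoopSketch {

    /** Hypothetical stand-in for the parser input: hands out one character at a time. */
    static final class ToyInput {
        private final String data;
        private int pos;

        ToyInput(String data) {
            this.data = data;
        }

        /** Returns the next character, or '\0' once the input is exhausted (an assumption of this sketch). */
        char nextChar() {
            return pos < data.length() ? data.charAt(pos++) : '\0';
        }
    }

    /** Hypothetical stand-in for the parser output: collects the values of one record. */
    static final class ToyOutput {
        final StringBuilder appender = new StringBuilder();
        final List<String> values = new ArrayList<>();

        /** Stores the accumulated value and clears the appender for the next one. */
        void valueParsed() {
            values.add(appender.toString());
            appender.setLength(0);
        }
    }

    /** Reads one comma-separated record, stopping at '\n' or at the end of the input. */
    static List<String> parseRecord(String text) {
        ToyInput input = new ToyInput(text);
        ToyOutput output = new ToyOutput();

        char ch = input.nextChar();
        while (ch != '\n' && ch != '\0') {      // step 1: test the character stored in ch
            if (ch == ',') {
                output.valueParsed();           // step 4: a value has been fully read
            } else {
                output.appender.append(ch);     // step 3: append the desired character
            }
            ch = input.nextChar();              // step 2: request more characters
        }
        output.valueParsed();                   // the last value ends with the record
        return output.values;
    }
}
```

With these stand-ins, `RecordLoopSketch.parseRecord("a,b,c\n")` returns the three values `a`, `b` and `c`. Quotes, escapes and multi-character delimiters are deliberately left out.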
  Once AbstractParser.parseRecord() returns, the AbstractParser takes over and handles the information (generally, reorganizing it and passing it on to a RowProcessor).
  After the record processing, the AbstractParser reads the next characters from the input, delegating control again to the parseRecord() implementation for processing of the next record. This cycle repeats until the reading process is stopped by the user, the input is exhausted, or an error happens.
  In case of errors, the unchecked exception TextParsingException will be thrown and all resources in use will be closed automatically unless CommonParserSettings.isAutoClosingEnabled() evaluates to false. The exception should contain the cause and more information about where in the input the error happened.
  Specified by: parseRecord in class AbstractParser<CsvParserSettings>
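The error contract described for parseRecord() (an unchecked exception that carries the cause and a position, with resources closed unless auto-closing is disabled) can be illustrated with a toy model. Everything below is an assumption made for this sketch: `ToyParsingException`, `TrackingReader`, `countChars` and the rule that `'!'` is an error are hypothetical, and only the error path is modeled. It does not reproduce how the library itself is implemented.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Toy model of "throw an unchecked exception and close the input unless auto-closing is off". */
final class AutoCloseSketch {

    /** Hypothetical unchecked exception standing in for TextParsingException. */
    static final class ToyParsingException extends RuntimeException {
        ToyParsingException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Reader that remembers whether close() was called, so the effect can be observed. */
    static final class TrackingReader extends StringReader {
        boolean closed;

        TrackingReader(String text) {
            super(text);
        }

        @Override
        public void close() {
            closed = true;
            super.close();
        }
    }

    /** Counts characters; a '!' is treated as a parsing error (an arbitrary rule for this sketch). */
    static int countChars(Reader input, boolean autoClosingEnabled) {
        int count = 0;
        try {
            int c;
            while ((c = input.read()) != -1) {
                if (c == '!') {
                    throw new IllegalStateException("unexpected '!'");
                }
                count++;
            }
            return count;
        } catch (IOException | RuntimeException e) {
            if (autoClosingEnabled) {
                try {
                    input.close();              // release the resource before reporting the error
                } catch (IOException ignored) {
                    // nothing more can be done for a failed close in this sketch
                }
            }
            // wrap the cause and say where in the input the error happened
            throw new ToyParsingException("Error at character index " + count, e);
        }
    }
}
```

Calling `countChars` on the text `ab!c` throws `ToyParsingException` with the message `Error at character index 2`. The reader ends up closed only when `autoClosingEnabled` is `true`.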
- parseSingleDelimiterRecord
  private final void parseSingleDelimiterRecord()
- skipValue
  private void skipValue()
- handleValueSkipping
  private void handleValueSkipping(boolean quoted)
- handleUnescapedQuoteInValue
  private void handleUnescapedQuoteInValue()
- nextDelimiter
  private int nextDelimiter()
- handleUnescapedQuote
  private boolean handleUnescapedQuote()
- processQuoteEscape
  private void processQuoteEscape()
- parseValueProcessingEscape
  private void parseValueProcessingEscape()
- parseQuotedValue
  private void parseQuotedValue()
- getInputAnalysisProcess
  protected final InputAnalysisProcess getInputAnalysisProcess()
  Description copied from class: AbstractParser
  Allows the parser implementation to traverse the input buffer before the parsing process starts, in order to enable automatic configuration and discovery of data formats.
  Overrides: getInputAnalysisProcess in class AbstractParser<CsvParserSettings>
  Returns: a custom implementation of InputAnalysisProcess. By default, null is returned and no special input analysis will be performed.
- getDetectedFormat
  public final CsvFormat getDetectedFormat()
  Returns the CSV format detected when one of the following settings is enabled:
  - CommonParserSettings.isLineSeparatorDetectionEnabled()
  - CsvParserSettings.isDelimiterDetectionEnabled()
  - CsvParserSettings.isQuoteDetectionEnabled()
  The detected format will be available once the parsing process is initialized.
  Returns: the detected CSV format, or null if no detection has been enabled or if the parsing process has not been started yet.
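To make the idea of format detection concrete, here is a minimal sketch of one possible delimiter heuristic. It looks for a candidate character that occurs the same non-zero number of times in every sample row. This is an assumption chosen for illustration, not the algorithm the library uses; `DelimiterDetectionSketch`, `detectDelimiter` and the `'\0'` "nothing detected" result are hypothetical names and conventions.

```java
import java.util.List;

/** Toy delimiter detection over a few sample rows; not the univocity detection algorithm. */
final class DelimiterDetectionSketch {

    /**
     * Returns the first candidate that appears the same non-zero number of times
     * in every sample row, or '\0' when no candidate qualifies.
     */
    static char detectDelimiter(List<String> sampleRows, char[] candidates) {
        for (char candidate : candidates) {
            int expected = -1;                          // count seen in the first row
            boolean consistent = !sampleRows.isEmpty();
            for (String row : sampleRows) {
                int count = 0;
                for (int i = 0; i < row.length(); i++) {
                    if (row.charAt(i) == candidate) {
                        count++;
                    }
                }
                if (expected == -1) {
                    expected = count;
                }
                if (count == 0 || count != expected) {  // missing or uneven: reject this candidate
                    consistent = false;
                    break;
                }
            }
            if (consistent) {
                return candidate;
            }
        }
        return '\0';
    }
}
```

For the sample rows `a;b;c` and `1;2;3` with the candidates `,` and `;`, this sketch returns `;`. A real detector also has to account for quoted values, which this heuristic ignores.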
- consumeValueOnEOF
  protected final boolean consumeValueOnEOF()
  Description copied from class: AbstractParser
  Allows the parser implementation to handle any value that was being consumed when the end of the input was reached.
  Overrides: consumeValueOnEOF in class AbstractParser<CsvParserSettings>
  Returns: a flag indicating whether the parser was processing a value when the end of the input was reached.
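The situation this hook addresses arises when the input ends without a trailing delimiter or line break, so that a value is still sitting in the buffer. The sketch below models that case under simple assumptions: `EofValueSketch`, its `consume` method and the comma delimiter are hypothetical. Only the method name and the meaning of the returned flag follow the description above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of flushing a half-read value when the input ends; not the univocity implementation. */
final class EofValueSketch {
    final StringBuilder pending = new StringBuilder(); // characters of the value being read
    final List<String> values = new ArrayList<>();     // values completed so far

    /** Consumes comma-separated text that may end in the middle of a value. */
    void consume(String text) {
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch == ',') {
                values.add(pending.toString());
                pending.setLength(0);
            } else {
                pending.append(ch);
            }
        }
    }

    /**
     * Toy counterpart of consumeValueOnEOF(): stores a partially read value, if any,
     * and reports whether a value was being processed when the input ended.
     */
    boolean consumeValueOnEOF() {
        if (pending.length() == 0) {
            return false;
        }
        values.add(pending.toString());
        pending.setLength(0);
        return true;
    }
}
```

After `consume("a,b")` only `a` has been stored. The first call to `consumeValueOnEOF()` then returns `true` and stores `b`, and a second call returns `false`.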
- updateFormat
  public final void updateFormat(CsvFormat format)
  Allows changing the format of the input on the fly.
  Parameters: format - the new format to use.
- skipWhitespace
  private void skipWhitespace()
- saveMatchingCharacters
  private void saveMatchingCharacters()
- matchDelimiter
  private boolean matchDelimiter()
- matchDelimiterAfterQuote
  private boolean matchDelimiterAfterQuote()
- parseMultiDelimiterRecord
  private void parseMultiDelimiterRecord()
- appendUntilMultiDelimiter
  private void appendUntilMultiDelimiter()
- parseQuotedValueMultiDelimiter
  private void parseQuotedValueMultiDelimiter()
- parseValueProcessingEscapeMultiDelimiter
  private void parseValueProcessingEscapeMultiDelimiter()