Class LookaheadCharInputReader

java.lang.Object
com.univocity.parsers.common.input.LookaheadCharInputReader
All Implemented Interfaces:
CharInput, CharInputReader

public class LookaheadCharInputReader extends Object implements CharInputReader
A special implementation of CharInputReader that wraps another CharInputReader and collects a sequence of characters from the wrapped input, in order to analyze what the buffer contains ahead of the current position.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    private char
    delimiter
     
    private int
    length
     
    private char[]
    lookahead
     
    private final char
    newLine
     
    private final CharInputReader
    reader
     
    private int
    start
     
    private final int
    whitespaceRangeStart
     
  • Constructor Summary

    Constructors
    Constructor
    Description
    LookaheadCharInputReader(CharInputReader reader, char newLine, int whitespaceRangeStart)
    Creates a lookahead input reader by wrapping a given CharInputReader implementation
  • Method Summary

    Modifier and Type
    Method
    Description
    long
    charCount()
    Returns the number of characters returned by CharInputReader.nextChar() at any given time.
    String
    currentParsedContent()
    Returns a String with the input character sequence parsed to produce the current record.
    int
    currentParsedContentLength()
    Returns the length of the character sequence parsed to produce the current record.
    void
    enableNormalizeLineEndings(boolean escaping)
    Indicates to the input reader that the parser is running in "escape" mode and new lines should be returned as-is to prevent modifying the content of the parsed value.
    final char
    getChar()
    Returns the last character returned by the CharInputReader.nextChar() method.
    char[]
    getLineSeparator()
    Returns the line separator used by this character input reader.
    String
    getLookahead()
    Returns the current lookahead value.
    String
    getLookahead(char current)
    Returns the lookahead value prepended with the current character
    String
    getQuotedString(char quote, char escape, char escapeEscape, int maxLength, char stop1, char stop2, boolean keepQuotes, boolean keepEscape, boolean trimLeading, boolean trimTrailing)
    Attempts to collect a quoted String from the current position until a closing quote or stop character is found on the input, or a line ending is reached.
    String
    getString(char ch, char stop, boolean trim, String nullValue, int maxLength)
    Attempts to collect a String from the current position until a stop character is found on the input, or a line ending is reached.
    int
    lastIndexOf(char ch)
    Returns the last index of a given character in the current parsed content
    long
    lineCount()
    Returns the number of newlines read so far.
    void
    lookahead(int numberOfCharacters)
    Fills the lookahead buffer with a given number of characters that will be extracted from the wrapped CharInputReader
    void
    markRecordStart()
    Marks the start of a new record in the input, used internally to calculate the result of CharInputReader.currentParsedContent()
    boolean
    matches(char[] sequence, char wildcard)
    Matches a sequence of characters against the current lookahead buffer.
    boolean
    matches(char current, char[] sequence, char wildcard)
    Matches a sequence of characters against the current lookahead buffer.
    char
    nextChar()
    Returns the next character in the input provided by the active Reader.
    String
    readComment()
    Collects the comment line found on the input.
    void
    reloadBuffer()
     
    void
    skipLines(long lineCount)
    Skips characters in the input until the given number of lines is discarded.
    boolean
    skipQuotedString(char quote, char escape, char stop1, char stop2)
    Attempts to skip a quoted String from the current position until a stop character is found on the input, or a line ending is reached.
    boolean
    skipString(char ch, char stop)
    Attempts to skip a String from the current position until a stop character is found on the input, or a line ending is reached.
    char
    skipWhitespace(char ch, char stopChar1, char stopChar2)
    Skips characters from the current input position, until a non-whitespace character, or a stop character is found
    void
    start(Reader reader)
    Initializes the CharInputReader implementation with a Reader which provides access to the input.
    void
    stop()
    Stops the CharInputReader from reading characters from the Reader provided in CharInputReader.start(Reader) and closes it.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • reader

      private final CharInputReader reader
    • lookahead

      private char[] lookahead
    • length

      private int length
    • start

      private int start
    • newLine

      private final char newLine
    • delimiter

      private char delimiter
    • whitespaceRangeStart

      private final int whitespaceRangeStart
  • Constructor Details

    • LookaheadCharInputReader

      public LookaheadCharInputReader(CharInputReader reader, char newLine, int whitespaceRangeStart)
      Creates a lookahead input reader by wrapping a given CharInputReader implementation
      Parameters:
      reader - the input reader whose characters will be read and stored in a limited internal buffer, in order to allow a parser to query which characters are available ahead of the current input position.
      newLine - the normalized character that represents a line ending. Used internally as a stop character.
      whitespaceRangeStart - starting range of characters considered to be whitespace.
  • Method Details

    • matches

      public boolean matches(char current, char[] sequence, char wildcard)
      Matches a sequence of characters against the current lookahead buffer.
      Parameters:
      current - the last character used by the parser, which should match the first character in the lookahead buffer
      sequence - the sequence of characters, following the current character, that is expected to appear in the current lookahead buffer
      wildcard - character used in the sequence as a wildcard (e.g. * or ?), meaning any character is acceptable in its place.
      Returns:
      true if the current character and the sequence of characters that follows it are present in the lookahead, otherwise false
    • matches

      public boolean matches(char[] sequence, char wildcard)
      Matches a sequence of characters against the current lookahead buffer.
      Parameters:
      sequence - the sequence of characters expected to appear in the current lookahead buffer
      wildcard - character used in the sequence as a wildcard (e.g. * or ?), meaning any character is acceptable in its place.
      Returns:
      true if the given sequence of characters is present in the lookahead, otherwise false
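The wildcard comparison described for both matches overloads can be illustrated with a small standalone sketch. This is not the library's source: the class name WildcardMatchSketch and the choice to pass the buffered characters as a plain char[] argument are assumptions made for illustration only.

```java
// Hypothetical sketch of wildcard matching against buffered lookahead characters.
// It does not use or reproduce the univocity-parsers implementation.
final class WildcardMatchSketch {

    /**
     * Returns true if every character of `sequence` equals the character at the
     * same index of `lookahead`, where `wildcard` in `sequence` accepts anything.
     */
    static boolean matches(char[] lookahead, char[] sequence, char wildcard) {
        if (sequence.length > lookahead.length) {
            return false; // fewer characters buffered than the sequence requires
        }
        for (int i = 0; i < sequence.length; i++) {
            char expected = sequence[i];
            if (expected != wildcard && expected != lookahead[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        char[] lookahead = "#!x".toCharArray();
        System.out.println(matches(lookahead, "#?".toCharArray(), '?'));  // true
        System.out.println(matches(lookahead, "#?y".toCharArray(), '?')); // false
    }
}
```

In this sketch a sequence longer than the buffered content is treated as a mismatch, so a caller would fill the buffer with at least sequence.length characters before testing.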
    • getLookahead

      public String getLookahead()
      Returns the current lookahead value.
      Returns:
      the current lookahead value, or an empty String if the lookahead buffer is empty.
    • getLookahead

      public String getLookahead(char current)
      Returns the lookahead value prepended with the current character
      Parameters:
      current - the current character obtained by the parser
      Returns:
      a String formed by the given character followed by the lookahead value (if any).
    • lookahead

      public void lookahead(int numberOfCharacters)
      Fills the lookahead buffer with a given number of characters that will be extracted from the wrapped CharInputReader
      Parameters:
      numberOfCharacters - the number of characters to read from the wrapped CharInputReader, given in the constructor of this class.
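The interplay between lookahead(int), getLookahead() and nextChar() can be sketched with a minimal wrapper over java.io.Reader. Everything below is an illustrative assumption (the class LookaheadBufferSketch, its fields, and the use of a plain Reader instead of a CharInputReader); it only shows the general idea of serving buffered characters before reading from the wrapped input again.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch of a lookahead buffer; not the univocity-parsers source.
final class LookaheadBufferSketch {
    private final Reader in;
    private char[] buffer = new char[0];
    private int start = 0;  // index of the next buffered character to hand out
    private int length = 0; // number of valid characters in the buffer

    LookaheadBufferSketch(Reader in) {
        this.in = in;
    }

    /** Reads up to n more characters from the wrapped reader into the buffer. */
    void lookahead(int n) {
        int unread = length - start;
        char[] next = new char[unread + n];
        System.arraycopy(buffer, start, next, 0, unread); // keep unread characters
        buffer = next;
        start = 0;
        length = unread;
        try {
            while (n > 0) {
                int read = in.read(buffer, length, n);
                if (read == -1) {
                    break; // end of input: the buffer holds fewer characters than requested
                }
                length += read;
                n -= read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the buffered characters not yet consumed, or "" if there are none. */
    String getLookahead() {
        return new String(buffer, start, length - start);
    }

    /** Serves buffered characters first, then the wrapped reader; '\0' at end of input. */
    char nextChar() {
        if (start < length) {
            return buffer[start++];
        }
        try {
            int c = in.read();
            return c == -1 ? '\0' : (char) c;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        LookaheadBufferSketch r = new LookaheadBufferSketch(new StringReader("a,b\n"));
        r.nextChar();                         // consumes 'a'
        r.lookahead(2);                       // buffers ",b" without consuming it
        System.out.println(r.getLookahead()); // ,b
        System.out.println(r.nextChar());     // ,
    }
}
```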
    • start

      public void start(Reader reader)
      Description copied from interface: CharInputReader
      Initializes the CharInputReader implementation with a Reader which provides access to the input.
      Specified by:
      start in interface CharInputReader
      Parameters:
      reader - A Reader that provides access to the input.
    • stop

      public void stop()
      Description copied from interface: CharInputReader
      Stops the CharInputReader from reading characters from the Reader provided in CharInputReader.start(Reader) and closes it.
      Specified by:
      stop in interface CharInputReader
    • nextChar

      public char nextChar()
      Description copied from interface: CharInputReader
      Returns the next character in the input provided by the active Reader.

      If the input contains a sequence of newline characters (defined by Format.getLineSeparator()), this method will automatically convert them to the newline character specified in Format.getNormalizedNewline().

      A subsequent call to this method will return the character after the newline sequence.

      Specified by:
      nextChar in interface CharInput
      Specified by:
      nextChar in interface CharInputReader
      Returns:
      the next character in the input. '\0' if there are no more characters in the input or if the CharInputReader was stopped.
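The newline normalization described for nextChar() can be sketched as follows. This is a simplified illustration under stated assumptions: the line separator is exactly two characters long, the class NewlineNormalizingSketch is hypothetical, and the separator and normalized newline are passed in directly rather than read from a Format instance.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch of line-ending normalization; not the library's implementation.
final class NewlineNormalizingSketch {
    private final PushbackReader in;
    private final char sep1;       // first character of the line separator, e.g. '\r'
    private final char sep2;       // second character of the line separator, e.g. '\n'
    private final char normalized; // character returned in place of the separator

    NewlineNormalizingSketch(Reader in, char sep1, char sep2, char normalized) {
        this.in = new PushbackReader(in, 1);
        this.sep1 = sep1;
        this.sep2 = sep2;
        this.normalized = normalized;
    }

    /** Returns the next character, collapsing sep1+sep2 into `normalized`; '\0' at end of input. */
    char nextChar() {
        try {
            int c = in.read();
            if (c == -1) {
                return '\0';
            }
            if (c == sep1) {
                int next = in.read();
                if (next == sep2) {
                    return normalized; // full separator consumed
                }
                if (next != -1) {
                    in.unread(next); // not a separator: give the character back
                }
            }
            return (char) c;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        NewlineNormalizingSketch r =
                new NewlineNormalizingSketch(new StringReader("a\r\nb"), '\r', '\n', '\n');
        StringBuilder out = new StringBuilder();
        for (char ch = r.nextChar(); ch != '\0'; ch = r.nextChar()) {
            out.append(ch);
        }
        System.out.println(out.length()); // 3
    }
}
```

Because '\0' doubles as the end-of-input marker, a sketch like this cannot distinguish a literal NUL character in the data from the end of the input.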
    • charCount

      public long charCount()
      Description copied from interface: CharInputReader
      Returns the number of characters returned by CharInputReader.nextChar() at any given time.
      Specified by:
      charCount in interface CharInputReader
      Returns:
      the number of characters returned by CharInputReader.nextChar()
    • lineCount

      public long lineCount()
      Description copied from interface: CharInputReader
      Returns the number of newlines read so far.
      Specified by:
      lineCount in interface CharInputReader
      Returns:
      the number of newlines read so far.
    • skipLines

      public void skipLines(long lineCount)
      Description copied from interface: CharInputReader
      Skips characters in the input until the given number of lines is discarded.
      Specified by:
      skipLines in interface CharInputReader
      Parameters:
      lineCount - the number of lines to skip from the current location in the input
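How charCount(), lineCount() and skipLines(long) relate to nextChar() can be shown with a small in-memory sketch. The class LineCountingSketch is hypothetical, it reads from a String rather than a Reader, and it assumes newlines are already normalized to '\n'. Stopping quietly when the input ends before enough lines were skipped is a choice made for this sketch, not documented library behavior.

```java
// Hypothetical sketch of character and line counters driven by nextChar().
final class LineCountingSketch {
    private final String input;
    private int position = 0;
    private long charCount = 0;
    private long lineCount = 0;

    LineCountingSketch(String input) {
        this.input = input;
    }

    /** Returns the next character and updates both counters; '\0' at end of input. */
    char nextChar() {
        if (position >= input.length()) {
            return '\0';
        }
        char ch = input.charAt(position++);
        charCount++;
        if (ch == '\n') {
            lineCount++;
        }
        return ch;
    }

    long charCount() {
        return charCount;
    }

    long lineCount() {
        return lineCount;
    }

    /** Discards characters until `lines` more newlines were read or the input ends. */
    void skipLines(long lines) {
        long target = lineCount + lines;
        while (lineCount < target && position < input.length()) {
            nextChar();
        }
    }

    public static void main(String[] args) {
        LineCountingSketch r = new LineCountingSketch("h1\nh2\nrow\n");
        r.skipLines(2);
        System.out.println(r.nextChar());  // r
        System.out.println(r.lineCount()); // 2
        System.out.println(r.charCount()); // 7
    }
}
```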
    • enableNormalizeLineEndings

      public void enableNormalizeLineEndings(boolean escaping)
      Description copied from interface: CharInputReader
      Indicates to the input reader that the parser is running in "escape" mode and new lines should be returned as-is to prevent modifying the content of the parsed value.
      Specified by:
      enableNormalizeLineEndings in interface CharInputReader
      Parameters:
      escaping - flag indicating that the parser is escaping values and line separators are to be returned as-is.
    • readComment

      public String readComment()
      Description copied from interface: CharInputReader
      Collects the comment line found on the input.
      Specified by:
      readComment in interface CharInputReader
      Returns:
      the text found in the comment from the current position.
    • getLineSeparator

      public char[] getLineSeparator()
      Description copied from interface: CharInputReader
      Returns the line separator used by this character input reader. This could be the line separator defined in the Format.getLineSeparator() configuration, or the line separator sequence identified automatically when CommonParserSettings.isLineSeparatorDetectionEnabled() evaluates to true.
      Specified by:
      getLineSeparator in interface CharInputReader
      Returns:
      the line separator in use.
    • getChar

      public final char getChar()
      Description copied from interface: CharInputReader
      Returns the last character returned by the CharInputReader.nextChar() method.
      Specified by:
      getChar in interface CharInput
      Specified by:
      getChar in interface CharInputReader
      Returns:
      the last character returned by the CharInputReader.nextChar() method. '\0' if there are no more characters in the input or if the CharInputReader was stopped.
    • skipWhitespace

      public char skipWhitespace(char ch, char stopChar1, char stopChar2)
      Description copied from interface: CharInputReader
      Skips characters from the current input position, until a non-whitespace character, or a stop character is found
      Specified by:
      skipWhitespace in interface CharInputReader
      Parameters:
      ch - the current character of the input
      stopChar1 - the first stop character (which can be a whitespace)
      stopChar2 - the second stop character (which can be a whitespace)
      Returns:
      the first non-whitespace character (or delimiter) found in the input.
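A whitespace-skipping loop with two stop characters might look like the sketch below. It assumes, for illustration only, that a character counts as whitespace when it is less than or equal to ' ' and greater than whitespaceRangeStart. The class SkipWhitespaceSketch is hypothetical and reads from a String holding the input that follows the current character.

```java
// Hypothetical sketch of skipWhitespace with two stop characters.
final class SkipWhitespaceSketch {
    private final String input;            // input that follows the current character
    private final int whitespaceRangeStart;
    private int position = 0;

    SkipWhitespaceSketch(String input, int whitespaceRangeStart) {
        this.input = input;
        this.whitespaceRangeStart = whitespaceRangeStart;
    }

    // Assumption: whitespace is any character in the range (whitespaceRangeStart, ' '].
    private boolean isWhitespace(char ch) {
        return ch <= ' ' && ch > whitespaceRangeStart;
    }

    /**
     * Starting from `ch`, advances while the character is whitespace and is not a
     * stop character. Returns the character that ended the scan, or '\0' at end of input.
     */
    char skipWhitespace(char ch, char stopChar1, char stopChar2) {
        while (isWhitespace(ch) && ch != stopChar1 && ch != stopChar2) {
            if (position >= input.length()) {
                return '\0';
            }
            ch = input.charAt(position++);
        }
        return ch;
    }

    public static void main(String[] args) {
        SkipWhitespaceSketch r = new SkipWhitespaceSketch(" \tvalue", 0);
        System.out.println(r.skipWhitespace(' ', ',', '\n')); // v
    }
}
```

Checking the stop characters inside the loop is what lets a whitespace delimiter such as a tab end the scan instead of being skipped.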
    • currentParsedContent

      public String currentParsedContent()
      Description copied from interface: CharInputReader
      Returns a String with the input character sequence parsed to produce the current record.
      Specified by:
      currentParsedContent in interface CharInputReader
      Returns:
      the text content parsed for the current input record.
    • markRecordStart

      public void markRecordStart()
      Description copied from interface: CharInputReader
      Marks the start of a new record in the input, used internally to calculate the result of CharInputReader.currentParsedContent()
      Specified by:
      markRecordStart in interface CharInputReader
    • getString

      public String getString(char ch, char stop, boolean trim, String nullValue, int maxLength)
      Description copied from interface: CharInputReader
      Attempts to collect a String from the current position until a stop character is found on the input, or a line ending is reached. If the String can be obtained, the current position of the parser will be updated to the last consumed character. If the internal buffer needs to be reloaded, this method will return null and the current position of the buffer will remain unchanged.
      Specified by:
      getString in interface CharInputReader
      Parameters:
      ch - the current character to be considered. If equal to the stop character the nullValue will be returned
      stop - the stop character that identifies the end of the content to be collected
      trim - flag indicating whether or not trailing whitespaces should be discarded
      nullValue - value to return when the length of the content to be returned is 0.
      maxLength - the maximum length of the String to be returned. If the length exceeds this limit, null will be returned
      Returns:
      the String found on the input, or null if the buffer needs to be reloaded or the maximum length has been exceeded.
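The contract of getString (scan the buffer for a stop character, return null without moving when the buffer runs out or the value is too long) can be sketched over a fixed char[]. The class GetStringSketch is hypothetical. It reads the current character from buffer[position] instead of taking it as a parameter, and on success it leaves position at the character that ended the scan, which is one possible reading of "the last consumed character".

```java
// Hypothetical sketch of getString over an in-memory buffer.
final class GetStringSketch {
    private final char[] buffer;
    int position = 0; // index of the current character

    GetStringSketch(char[] buffer) {
        this.buffer = buffer;
    }

    /**
     * Collects characters from the current position up to `stop` or a '\n'.
     * Returns null, leaving `position` unchanged, if the buffer ends first or
     * the collected value is longer than maxLength.
     */
    String getString(char stop, boolean trim, String nullValue, int maxLength) {
        if (buffer[position] == stop) {
            return nullValue; // nothing to collect
        }
        int i = position;
        while (i < buffer.length && buffer[i] != stop && buffer[i] != '\n') {
            i++;
        }
        if (i >= buffer.length) {
            return null; // ran out of buffered characters: the caller must reload
        }
        int end = i;
        if (trim) {
            while (end > position && buffer[end - 1] <= ' ') {
                end--; // drop trailing whitespace
            }
        }
        int len = end - position;
        if (len > maxLength) {
            return null;
        }
        String result = len == 0 ? nullValue : new String(buffer, position, len);
        position = i; // now points at the stop character or line ending
        return result;
    }

    public static void main(String[] args) {
        GetStringSketch r = new GetStringSketch("abc  ,def\n".toCharArray());
        System.out.println(r.getString(',', true, null, 10)); // abc
    }
}
```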
    • getQuotedString

      public String getQuotedString(char quote, char escape, char escapeEscape, int maxLength, char stop1, char stop2, boolean keepQuotes, boolean keepEscape, boolean trimLeading, boolean trimTrailing)
      Description copied from interface: CharInputReader
      Attempts to collect a quoted String from the current position until a closing quote or stop character is found on the input, or a line ending is reached. If the String can be obtained, the current position of the parser will be updated to the last consumed character. If the internal buffer needs to be reloaded, this method will return null and the current position of the buffer will remain unchanged.
      Specified by:
      getQuotedString in interface CharInputReader
      Parameters:
      quote - the quote character
      escape - the quote escape character
      escapeEscape - the escape of the quote escape character
      maxLength - the maximum length of the String to be returned. If the length exceeds this limit, null will be returned
      stop1 - the first stop character that identifies the end of the content to be collected
      stop2 - the second stop character that identifies the end of the content to be collected
      keepQuotes - flag to indicate the quotes that wrap the resulting String should be kept.
      keepEscape - flag to indicate that escape sequences should be kept
      trimLeading - flag to indicate leading whitespaces should be trimmed
      trimTrailing - flag to indicate that trailing whitespaces should be trimmed
      Returns:
      the String found on the input, or null if the buffer needs to be reloaded or the maximum length has been exceeded.
    • currentParsedContentLength

      public int currentParsedContentLength()
      Description copied from interface: CharInputReader
      Returns the length of the character sequence parsed to produce the current record.
      Specified by:
      currentParsedContentLength in interface CharInputReader
      Returns:
      the length of the text content parsed for the current input record
    • skipString

      public boolean skipString(char ch, char stop)
      Description copied from interface: CharInputReader
      Attempts to skip a String from the current position until a stop character is found on the input, or a line ending is reached. If the String can be skipped, the current position of the parser will be updated to the last consumed character. If the internal buffer needs to be reloaded, this method will return false and the current position of the buffer will remain unchanged.
      Specified by:
      skipString in interface CharInputReader
      Parameters:
      ch - the current character to be considered. If equal to the stop character false will be returned
      stop - the stop character that identifies the end of the content to be collected
      Returns:
      true if an entire String value was found on the input and skipped, or false if the buffer needs to be reloaded.
    • skipQuotedString

      public boolean skipQuotedString(char quote, char escape, char stop1, char stop2)
      Description copied from interface: CharInputReader
      Attempts to skip a quoted String from the current position until a stop character is found on the input, or a line ending is reached. If the String can be skipped, the current position of the parser will be updated to the last consumed character. If the internal buffer needs to be reloaded, this method will return false and the current position of the buffer will remain unchanged.
      Specified by:
      skipQuotedString in interface CharInputReader
      Parameters:
      quote - the quote character
      escape - the quote escape character
      stop1 - the first stop character that identifies the end of the content to be collected
      stop2 - the second stop character that identifies the end of the content to be collected
      Returns:
      true if an entire String value was found on the input and skipped, or false if the buffer needs to be reloaded.
    • lastIndexOf

      public int lastIndexOf(char ch)
      Description copied from interface: CharInputReader
      Returns the last index of a given character in the current parsed content
      Specified by:
      lastIndexOf in interface CharInputReader
      Parameters:
      ch - the character to look for
      Returns:
      the last position of the given character in the current parsed content, or -1 if not found.
    • reloadBuffer

      public void reloadBuffer()