Class FixedWidthParserSettings

All Implemented Interfaces:
Cloneable

public class FixedWidthParserSettings extends CommonParserSettings<FixedWidthFormat>
This is the configuration class used by the Fixed-Width parser (FixedWidthParser).

In addition to the configuration options provided by CommonParserSettings, the FixedWidthParserSettings include:

  • skipTrailingCharsUntilNewline (defaults to false): Indicates whether or not any trailing characters beyond the record's length should be skipped until the newline is reached.

    For example, if the record length is 5, but the row contains "12345678\n", then the portion containing "678" will be discarded and not considered part of the next record.

  • recordEndsOnNewline (defaults to false): Indicates whether or not a record is considered parsed when a newline is reached.

    For example, if recordEndsOnNewline is set to true, then given a record of length 4, and the input "12\n3456", the parser will identify [12] and [3456]

    If recordEndsOnNewline is set to false, then given a record of length 4, and the input "12\n3456", the parser will identify a multi-line record [12\n3] and [456 ]
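
The interaction of the two flags above can be modelled with a small, self-contained sketch. It is not the parser's implementation: RecordSplitSketch and split are hypothetical names, each record is treated as a single run of characters, no padding is applied, and empty lines are assumed to produce no record.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the two flags; not the library's code.
class RecordSplitSketch {

    static List<String> split(String input, int recordLength,
                              boolean recordEndsOnNewline,
                              boolean skipTrailingCharsUntilNewline) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char ch = input.charAt(i++);
            if (ch == '\n' && recordEndsOnNewline) {
                // The newline closes the record early (assumption: empty lines are ignored).
                if (current.length() > 0) {
                    records.add(current.toString());
                    current.setLength(0);
                }
                continue;
            }
            current.append(ch);
            if (current.length() == recordLength) {
                records.add(current.toString());
                current.setLength(0);
                if (skipTrailingCharsUntilNewline) {
                    // Discard everything up to the next newline...
                    while (i < input.length() && input.charAt(i) != '\n') i++;
                    // ...and step over the newline itself (harmless at end of input).
                    i++;
                }
            }
        }
        // Keep a final, shorter record if the input ends before recordLength is reached.
        if (current.length() > 0) records.add(current.toString());
        return records;
    }
}
```

Under these assumptions, split("12\n3456", 4, true, false) yields [12] and [3456], while split("12\n3456", 4, false, false) yields [12\n3] and [456]. The real parser would additionally pad or trim the last value according to its format settings.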

The FixedWidthParserSettings need a definition of the field lengths of each record in the input. This must be provided using an instance of FixedWidthFields.

  • Field Details

    • skipTrailingCharsUntilNewline

      protected boolean skipTrailingCharsUntilNewline
    • recordEndsOnNewline

      protected boolean recordEndsOnNewline
    • useDefaultPaddingForHeaders

      private boolean useDefaultPaddingForHeaders
    • keepPadding

      private boolean keepPadding
    • fieldLengths

      private FixedWidthFields fieldLengths
    • lookaheadFormats

      private Map<String,FixedWidthFields> lookaheadFormats
    • lookbehindFormats

      private Map<String,FixedWidthFields> lookbehindFormats
  • Constructor Details

    • FixedWidthParserSettings

      public FixedWidthParserSettings(FixedWidthFields fieldLengths)
      You can only create an instance of this class by providing a definition of the field lengths of each record in the input.

      This must be provided using an instance of FixedWidthFields.

      Parameters:
      fieldLengths - the instance of FixedWidthFields which provides the lengths of each field in the fixed-width records to be parsed
    • FixedWidthParserSettings

      public FixedWidthParserSettings()
      Creates a basic configuration object for the Fixed-Width parser with no field length configuration. This constructor is intended to be used when the record length varies depending on the input row. Refer to addFormatForLookahead(String, FixedWidthFields) and addFormatForLookbehind(String, FixedWidthFields).
  • Method Details

    • getFieldLengths

      int[] getFieldLengths()
      Returns the sequence of lengths to be read by the parser to form a record.
      Returns:
      the sequence of lengths to be read by the parser to form a record.
    • getAllLengths

      int[] getAllLengths()
    • getFieldPaddings

      char[] getFieldPaddings()
      Returns the sequence of paddings used by each field of each record.
      Returns:
      the sequence of paddings used by each field of each record.
    • getFieldsToIgnore

      boolean[] getFieldsToIgnore()
      Returns the sequence of fields to ignore.
      Returns:
      the sequence of fields to ignore.
    • getFieldAlignments

      FieldAlignment[] getFieldAlignments()
      Returns the sequence of alignments to consider for each field of each record.
      Returns:
      the sequence of alignments to consider for each field of each record.
    • getSkipTrailingCharsUntilNewline

      public boolean getSkipTrailingCharsUntilNewline()
      Indicates whether or not any trailing characters beyond the record's length should be skipped until the newline is reached (defaults to false).

      For example, if the record length is 5, but the row contains "12345678\n", then the portion containing "678\n" will be discarded and not considered part of the next record.

      Returns:
      true if any trailing characters beyond the record's length should be skipped until the newline is reached, false otherwise
    • setSkipTrailingCharsUntilNewline

      public void setSkipTrailingCharsUntilNewline(boolean skipTrailingCharsUntilNewline)
      Defines whether or not any trailing characters beyond the record's length should be skipped until the newline is reached (defaults to false).

      For example, if the record length is 5, but the row contains "12345678\n", then the portion containing "678\n" will be discarded and not considered part of the next record.

      Parameters:
      skipTrailingCharsUntilNewline - a flag indicating if any trailing characters beyond the record's length should be skipped until the newline is reached
    • getRecordEndsOnNewline

      public boolean getRecordEndsOnNewline()
      Indicates whether or not a record is considered parsed when a newline is reached. Examples:
      • Consider two records of length 4, and the input 12\n3456
      • When recordEndsOnNewline is set to true: the first value will be read as 12 and the second 3456
      • When recordEndsOnNewline is set to false: the first value will be read as 12\n3 and the second 456

      Defaults to false.

      Returns:
      true if a record should be considered parsed when a newline is reached; false otherwise
    • setRecordEndsOnNewline

      public void setRecordEndsOnNewline(boolean recordEndsOnNewline)
      Defines whether or not a record is considered parsed when a newline is reached. Examples:
      • Consider two records of length 4, and the input 12\n3456
      • When recordEndsOnNewline is set to true: the first value will be read as 12 and the second 3456
      • When recordEndsOnNewline is set to false: the first value will be read as 12\n3 and the second 456
      Parameters:
      recordEndsOnNewline - a flag indicating whether or not a record is considered parsed when a newline is reached
    • createDefaultFormat

      protected FixedWidthFormat createDefaultFormat()
      Returns the default FixedWidthFormat configured to handle Fixed-Width inputs
      Specified by:
      createDefaultFormat in class CommonSettings<FixedWidthFormat>
      Returns:
      an instance of FixedWidthFormat configured to handle Fixed-Width inputs
    • newCharAppender

      protected CharAppender newCharAppender()
      Returns an instance of CharAppender configured with the maximum number of characters per column, the default value used to represent a null value (when the String parsed from the input is empty), and the padding character used to handle unwritten positions.

      This overrides the parent implementation to create a CharAppender capable of handling padding characters that represent unwritten positions.

      Overrides:
      newCharAppender in class CommonParserSettings<FixedWidthFormat>
      Returns:
      an instance of CharAppender configured with the maximum number of characters per column, the default value used to represent a null value (when the String parsed from the input is empty), and the padding character used to handle unwritten positions
    • getMaxCharsPerColumn

      public int getMaxCharsPerColumn()
      The maximum number of characters allowed for any given value being written/read. Used to avoid OutOfMemoryErrors (defaults to a minimum of 4096 characters).

      This overrides the parent implementation and calculates the absolute minimum number of characters required to store the values of a record.

      If the sum of all field lengths is greater than the configured maximum number of characters per column, the calculated amount will be returned.

      Overrides:
      getMaxCharsPerColumn in class CommonSettings<FixedWidthFormat>
      Returns:
      The maximum number of characters allowed for any given value being written/read
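
The rule described above (return the calculated amount when the sum of the field lengths exceeds the configured limit) can be sketched as follows. MaxCharsSketch and effectiveMaxCharsPerColumn are hypothetical names used only for illustration, and the sketch assumes a single record format with no lookahead or lookbehind formats.

```java
// Hypothetical illustration of the documented rule; not the library's code.
class MaxCharsSketch {

    // Returns the configured limit unless the record needs more room than that.
    static int effectiveMaxCharsPerColumn(int configuredMax, int[] fieldLengths) {
        int total = 0;
        for (int length : fieldLengths) {
            total += length; // characters required to hold one full record
        }
        return Math.max(configuredMax, total);
    }
}
```

With a configured limit of 4096, field lengths of 10 and 20 leave the limit unchanged, whereas field lengths of 3000 and 2000 raise the returned value to 5000.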
    • getMaxColumns

      public int getMaxColumns()
      Returns the hard limit of how many columns a record can have (defaults to a maximum of 512). You need this to avoid OutOfMemory errors in case of inputs that might be inconsistent with the format you are dealing with.

      This overrides the parent implementation and calculates the absolute minimum number of columns required to store the values of a record.

      If the sum of all fields is greater than the configured maximum number of columns, the calculated amount will be returned.

      Overrides:
      getMaxColumns in class CommonSettings<FixedWidthFormat>
      Returns:
      The maximum number of columns a record can have.
    • calculateMaxFieldLengths

      private int[] calculateMaxFieldLengths()
    • getLookaheadFormats

      Lookup[] getLookaheadFormats()
    • getLookbehindFormats

      Lookup[] getLookbehindFormats()
    • addFormatForLookahead

      public void addFormatForLookahead(String lookahead, FixedWidthFields lengths)
      Defines the format of records identified by a lookahead symbol.
      Parameters:
      lookahead - the lookahead value that when found in the input, will notify the parser to switch to a new record format, with different field lengths
      lengths - the field lengths of the record format identified by the given lookahead symbol.
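
A rough sketch of how a lookahead symbol might select a record format is shown below. It assumes the symbol is matched against the start of each row and that rows without a match fall back to a default set of lengths. Both points are assumptions of this sketch rather than documented behaviour, and LookaheadSketch, chooseLengths and slice are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model of lookahead-based format selection; not the library's code.
class LookaheadSketch {

    // Picks the field lengths whose lookahead symbol prefixes the row (assumed matching rule).
    static int[] chooseLengths(String row, Map<String, int[]> lookaheadFormats, int[] defaultLengths) {
        for (Map.Entry<String, int[]> format : lookaheadFormats.entrySet()) {
            if (row.startsWith(format.getKey())) {
                return format.getValue();
            }
        }
        return defaultLengths; // assumed fallback when no symbol matches
    }

    // Cuts a row into values using the given field lengths; short rows yield short or empty values.
    static List<String> slice(String row, int[] lengths) {
        List<String> values = new ArrayList<>();
        int pos = 0;
        for (int length : lengths) {
            int start = Math.min(pos, row.length());
            int end = Math.min(pos + length, row.length());
            values.add(row.substring(start, end));
            pos += length;
        }
        return values;
    }
}
```

For example, with the symbol "H" mapped to lengths {1, 4} and "D" mapped to {1, 3, 3}, the row "H2024" is sliced into [H, 2024] and the row "Dabcxyz" into [D, abc, xyz].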
    • addFormatForLookbehind

      public void addFormatForLookbehind(String lookbehind, FixedWidthFields lengths)
      Defines the format of records identified by a lookbehind symbol.
      Parameters:
      lookbehind - the lookbehind value that when found in the previous input row, will notify the parser to switch to a new record format, with different field lengths
      lengths - the field lengths of the record format identified by the given lookbehind symbol.
    • getUseDefaultPaddingForHeaders

      public boolean getUseDefaultPaddingForHeaders()
      Indicates whether headers should be parsed using the default padding specified in FixedWidthFormat.getPadding() instead of any custom padding associated with a given field (in FixedWidthFields.setPadding(char, int...)). Defaults to true.
      Returns:
      true if the default padding is to be used when reading headers, otherwise false
    • setUseDefaultPaddingForHeaders

      public void setUseDefaultPaddingForHeaders(boolean useDefaultPaddingForHeaders)
      Defines whether headers should be parsed using the default padding specified in FixedWidthFormat.getPadding() instead of any custom padding associated with a given field (in FixedWidthFields.setPadding(char, int...)).
      Parameters:
      useDefaultPaddingForHeaders - flag indicating whether the default padding is to be used when parsing headers
    • configureFromAnnotations

      protected void configureFromAnnotations(Class<?> beanClass)
      Description copied from class: CommonParserSettings
      Configures the parser based on the annotations provided in a given class
      Overrides:
      configureFromAnnotations in class CommonParserSettings<FixedWidthFormat>
      Parameters:
      beanClass - the class whose annotations will be processed to derive configurations for parsing
    • addConfiguration

      protected void addConfiguration(Map<String,Object> out)
      Overrides:
      addConfiguration in class CommonParserSettings<FixedWidthFormat>
    • clone

      public final FixedWidthParserSettings clone()
      Clones this configuration object to reuse all user-provided settings, including the fixed-width field configuration.
      Overrides:
      clone in class CommonParserSettings<FixedWidthFormat>
      Returns:
      a copy of all configurations applied to the current instance.
    • clone

      @Deprecated protected final FixedWidthParserSettings clone(boolean clearInputSpecificSettings)
      Deprecated.
      doesn't really make sense for fixed-width. Use alternative method clone(FixedWidthFields).
      Clones this configuration object to reuse most user-provided settings. This includes the fixed-width field configuration, but doesn't include other input-specific settings. This method is meant to be used internally only.
      Overrides:
      clone in class CommonParserSettings<FixedWidthFormat>
      Parameters:
      clearInputSpecificSettings - flag indicating whether to clear settings that are likely to be associated with a given input.
      Returns:
      a copy of all configurations applied to the current instance.
    • clone

      public final FixedWidthParserSettings clone(FixedWidthFields fields)
      Clones this configuration object to reuse most user-provided settings. Properties that are specific to a given input (such as header names and selection of fields) will be reset to their defaults. To obtain a full copy, use clone().
      Parameters:
      fields - the fixed-width field configuration to be used by the cloned settings object.
      Returns:
      a copy of the general configurations applied to the current instance.
    • clone

      private FixedWidthParserSettings clone(boolean clearInputSpecificSettings, FixedWidthFields fields)
    • getKeepPadding

      public final boolean getKeepPadding()
      Indicates whether the padding character should be kept in the parsed value (defaults to false). This setting can be overridden for individual fields through FixedWidthFields.stripPaddingFrom(String, String...) and FixedWidthFields.keepPaddingOn(String, String...).
      Returns:
      flag indicating the padding character should be kept in the parsed value
    • setKeepPadding

      public final void setKeepPadding(boolean keepPadding)
      Configures the fixed-width parser to retain the padding character in any parsed values (defaults to false). This setting can be overridden for individual fields through FixedWidthFields.stripPaddingFrom(String, String...) and FixedWidthFields.keepPaddingOn(String, String...).
      Parameters:
      keepPadding - flag indicating the padding character should be kept in the parsed value
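
The effect of the flag on a single value can be illustrated with a minimal sketch. It assumes a left-aligned field, so padding only appears at the end of the raw text. KeepPaddingSketch and readValue are hypothetical names, and the code is not taken from the library.

```java
// Hypothetical illustration of keepPadding for a left-aligned field; not the library's code.
class KeepPaddingSketch {

    static String readValue(String rawField, char padding, boolean keepPadding) {
        if (keepPadding) {
            return rawField; // padding characters stay in the parsed value
        }
        int end = rawField.length();
        // Drop trailing padding characters only.
        while (end > 0 && rawField.charAt(end - 1) == padding) {
            end--;
        }
        return rawField.substring(0, end);
    }
}
```

Given the raw field "abc__" and the padding character '_', the sketch returns "abc" when keepPadding is false and "abc__" when it is true.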
    • getKeepPaddingFlags

      Boolean[] getKeepPaddingFlags()
      Returns the sequence of fields whose padding character must or must not be retained in the parsed value.
      Returns:
      the sequence of fields that have an explicit 'keepPadding' flag.