Delimited flat-file parsing often leads to brittle index-based code. In this post, I show how enums make field positions easier to read and maintain.

In the examples below, we assume the input has already been split:

String[] split = delimitedData.split("\\|");

There are caveats to using the split method this way, but they are outside the scope of this post.
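
For readers who want one concrete instance of those caveats, here is a minimal sketch: the single-argument `split` drops trailing empty strings, so a record whose last fields are empty produces a shorter array than the field count suggests. Passing a negative limit keeps them. The class name `SplitCaveat`, its two helper methods, and the sample record are assumptions made for illustration, not part of the original code.

```java
class SplitCaveat {
    // Single-argument split: trailing empty strings are removed from the result.
    static String[] splitDefault(String line) {
        return line.split("\\|");
    }

    // Negative limit: the pattern is applied as often as possible and
    // trailing empty strings are kept.
    static String[] splitKeepingEmpty(String line) {
        return line.split("\\|", -1);
    }

    public static void main(String[] args) {
        String line = "a|b||"; // hypothetical record whose last two fields are empty
        System.out.println(splitDefault(line).length);      // prints 2
        System.out.println(splitKeepingEmpty(line).length); // prints 4
    }
}
```

With index-based access this matters: reading a trailing field from the shorter array throws `ArrayIndexOutOfBoundsException` instead of returning an empty string.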

Direct indexing

DbData existingData = dbHandler.getExistingData(
    split[2],
    split[7],
    split[8],
    split[9]
);

Direct indexing is compact, but brittle and hard to scan.

Local variables

String id = split[2];
String date = split[7];
String time = split[8];
String reason = split[9];

DbData existingData = dbHandler.getExistingData(
    id, 
    date, 
    time, 
    reason
);

Local variables improve readability at the call site, but the index-to-field mappings are still scattered across the codebase.

Enum mapping

public enum FlatFileField {
    ID(2),
    DATE(7),
    TIME(8),
    REASON(9);

    private final int index;

    FlatFileField(int index) {
        this.index = index;
    }

    public int index() {
        return index;
    }
}

DbData existingData = dbHandler.getExistingData(
    split[FlatFileField.ID.index()],
    split[FlatFileField.DATE.index()],
    split[FlatFileField.TIME.index()],
    split[FlatFileField.REASON.index()]
);
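
One possible refinement, sketched here rather than taken from the original code: the enum can read its own field, so the call site does not repeat `split[...index()]` and a record that is too short fails with a descriptive message. The `from` method and the `FlatFileFieldDemo` class are hypothetical; the sample line is the pipe-separated example given in the comments.

```java
class FlatFileFieldDemo {
    public static void main(String[] args) {
        String[] split =
            "foo|bar|id2|foo|bar|foo|bar|2026-06-03|23:59:00|random reason".split("\\|", -1);

        System.out.println(FlatFileField.ID.from(split));     // prints id2
        System.out.println(FlatFileField.DATE.from(split));   // prints 2026-06-03
        System.out.println(FlatFileField.TIME.from(split));   // prints 23:59:00
        System.out.println(FlatFileField.REASON.from(split)); // prints random reason
    }
}

enum FlatFileField {
    ID(2),
    DATE(7),
    TIME(8),
    REASON(9);

    private final int index;

    FlatFileField(int index) {
        this.index = index;
    }

    int index() {
        return index;
    }

    // Hypothetical helper: returns this field's value from an already split
    // record, or fails with a message naming the field and the record length.
    String from(String[] split) {
        if (index >= split.length) {
            throw new IllegalArgumentException(
                name() + " expects index " + index
                    + " but the record has only " + split.length + " fields");
        }
        return split[index];
    }
}
```

The positions are still fixed offsets, so a reordered column still has to be corrected in the enum; the helper only makes a truncated record fail loudly.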

Comparison

Approach         | Pros                                        | Cons
Direct indexing  | Concise                                     | Uses magic numbers, hard to maintain, higher cognitive load
Local variables  | Readable at call site                       | Field mapping still scattered
Enum mapping     | Centralized field positions, clearer intent | Requires an additional enum

Takeaway

Enums are a simple way to replace magic numbers with meaningful names when working with delimited data. They improve readability and centralize field positions. When parsing logic grows beyond simple positional access, a dedicated parser or DTO is usually a better choice.
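
As a rough sketch of that last point, assuming Java 16 or later for records: a small DTO can own the positional access in a single factory method, so callers work with named accessors instead of the array. The names `FlatFileRecord`, `parse`, and the index constants are hypothetical; the sample line is the pipe-separated example given in the comments.

```java
class FlatFileRecordDemo {
    public static void main(String[] args) {
        FlatFileRecord r = FlatFileRecord.parse(
            "foo|bar|id2|foo|bar|foo|bar|2026-06-03|23:59:00|random reason");

        System.out.println(r.id());     // prints id2
        System.out.println(r.date());   // prints 2026-06-03
        System.out.println(r.time());   // prints 23:59:00
        System.out.println(r.reason()); // prints random reason
    }
}

// Hypothetical DTO holding the four fields the post reads from each record.
record FlatFileRecord(String id, String date, String time, String reason) {
    private static final int ID_INDEX = 2;
    private static final int DATE_INDEX = 7;
    private static final int TIME_INDEX = 8;
    private static final int REASON_INDEX = 9;
    private static final int MIN_FIELDS = REASON_INDEX + 1;

    // Splits one pipe-delimited line and maps the known positions to fields.
    static FlatFileRecord parse(String line) {
        String[] split = line.split("\\|", -1); // -1 keeps trailing empty fields
        if (split.length < MIN_FIELDS) {
            throw new IllegalArgumentException(
                "Expected at least " + MIN_FIELDS + " fields but got " + split.length);
        }
        return new FlatFileRecord(
            split[ID_INDEX], split[DATE_INDEX], split[TIME_INDEX], split[REASON_INDEX]);
    }
}
```

The call from the post would then read `dbHandler.getExistingData(r.id(), r.date(), r.time(), r.reason())`, with every index confined to the record.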

  • nark3d@thelemmy.club
    2 days ago

    The enum is a real improvement over bare integer indices: the call site reads as a name rather than a magic 7. The bit I’d watch is what the enum actually maps to. If it maps to a fixed offset, you’ve named the brittleness rather than removed it, since a reordered column still breaks it silently. If it maps to a field identity, and you resolve the offset from a header or a known layout, the name carries the meaning and the position is free to move without taking the parser down.

    • The frustrated developer@programming.dev (OP)
      2 days ago

      You can absolutely use ordinals, especially when mapping every field in the split, but my recommendation then would be to add a comment after each constant in the enum, to indicate the number of the ordinal. This will help when mapping the constants to numbers and vice versa, making it easier to follow code flows.

      The explicit mapping is there to reduce the cognitive load, and also because this mapping only deals with four of the fields.

  • litchralee@sh.itjust.works
    3 days ago

    I have two questions: 1) was this written using an AI/LLM? And 2) can you give an example input text that this parser is meant to parse?

    • The frustrated developer@programming.dev (OP)
      3 days ago

      It was not written by one, but one was used to edit the original article to reduce its length and make it more concise.

      This code would parse pipe-separated data such as:

      "foo|bar|id2|foo|bar|foo|bar|2026-06-03|23:59:00|random reason"

      • litchralee@sh.itjust.works
        3 days ago

        If the data is already delimited, why does the implementation need to have fixed offsets? Why not just count the fields, as 0, 1, 2, etc?

        Your post speaks of brittle code but using fixed offsets is almost always guaranteed to be brittle.