Idris2Doc : Data.ByteString.Search.DFA

Data.ByteString.Search.DFA

(source)
Fast deterministic finite automaton (DFA) search of ByteStrings

Definitions

matchDFA : (pat : ByteString) -> (target : ByteString) ->F1s (ListNat)
  Performs a string search on a `ByteString` utilizing a determinisitic-finite-automaton (DFA).

This function finds all (0-based) starting indices of the non-empty pattern `ByteString`
pat within the non-empty target `ByteString`, using the deterministic-finite-automaton
(DFA) computed by `automaton`.

Example:

| pat | target |
| ---- | ---------- |
| "AN" | "ANPANMAN" |

| Start | Substring | Match? | Explanation |
| ----- | -------------- | ------ | ------------------------------------------------ |
| 0 | **"AN"**PANMAN | Yes | Full pattern `"AN"` matches starting at index 0. |
| 1 | A**"NP"**ANMAN | No | Mismatch after the first character. |
| 2 | AN**"PA"**NMAN | No | No match — next candidate after suffix shift. |
| 3 | ANP**"AN"**MAN | Yes | Match found at index 3. |
| 4 | ANPA**"NM"**AN | No | Mismatch. |
| 5 | ANPAN**"MA"**N | No | Mismatch. |
| 6 | ANPANM**"AN"** | Yes | Final match found at index 6. |


matchDFA "AN" "ANPANMAN" => [0, 3, 6]

Totality: total
Visibility: export
indicesDFA : (pat : ByteString) -> (target : ByteString) ->F1s (ListNat)
  Performs a string search on a `ByteString` utilizing a determinisitic-finite-automaton (DFA).

This function finds all (0-based) indices (possibly overlapping)
of the non-empty pattern `ByteString` pat
within the non-empty target `ByteString`, using the deterministic-finite-automaton
(DFA) computed by `automaton`.

Example:

| pat | target |
| -------- | ----------- |
| "ABCABC" | "ABCABCABC" |

| Start | Substring | Match? | Explanation |
| ----- | --------------- | ------ | ---------------------------------------------------------------- |
| 0 | **"ABCABC"**ABC | Yes | Full pattern matches starting at index 0. |
| 1 | A**"BCABCA"**BC | No | Mismatch starts immediately after first letter. |
| 2 | AB**"CABCAA"**C | No | Shift by suffix table → mismatch on 2nd char. |
| 3 | ABC**"ABC"** | Yes | Overlapping match starting at index 3 (because `"ABC"` repeats). |

indicesDFA "ABCABC" "ABCABCABC" => [0, 3]

Totality: total
Visibility: export
breakDFA : (pat : ByteString) -> (target : ByteString) ->F1s (ByteString, ByteString)
  Splits a ByteString at the first match of pat in target.

This function uses the deterministic-finite-automaton matcher (with overlap = False) to
locate the earliest occurrence of pat in target. If the pattern is
found at index i, the pattern ByteString pat is split at that index,
returning the prefix and suffix as a pair (before, after).

If the pattern does not occur in the target, (pat, empty) is returned.
In other words, the entire pattern becomes the “before” part and the
“after” part is an empty ByteString.

Totality: total
Visibility: export
breakAfterDFA : (pat : ByteString) -> (target : ByteString) ->F1s (ByteString, ByteString)
  Splits a ByteString after the first match of pat in target.

This function uses the deterministic-finite-automaton matcher (with overlap = False) to
find the earliest occurrence of pat in target. If the pattern is
found at index i, this function splits pat at position i + length pat,
producing a pair (before, after) that places the entire matched region
into the prefix.

If the pattern does not occur in target, the function returns
(pat, empty), the entire pattern is the “before” substring, and the
suffix is empty.

Totality: total
Visibility: export
splitKeepFrontDFA : (pat : ByteString) -> (target : ByteString) ->F1s (ListByteString)
  Splits a ByteString into a list of pieces according to repeated
matches of target, keeping the matching prefix of pat
at the front of each produced chunk.

This function repeatedly searches target for occurrences of pat
(using the deterministic-finite-automaton matcher with overlap = False). Each time a
match is found at index i, the prefix of pat up to i + length pat
is emitted as the next chunk, and the function continues processing the
remaining suffix of pat.

Unlike breakDFA or breakAfterDFA, this function performs repeated
splitting until the entire pattern has been consumed, producing a
list of ByteStrings.

Totality: total
Visibility: export
splitKeepEndDFA : (pat : ByteString) -> (target : ByteString) ->F1s (ListByteString)
  Splits a ByteString into a list of pieces according to repeated
matches of pat inside target, keeping the matching
suffix of pat at the end of each produced chunk.

This function repeatedly searches target for occurrences of pat
(using the deterministic-finite-automaton matcher with overlap = False). Each time a
match is found at index i, the next chunk emitted is the prefix of
target of length i + length pat, which includes the entire matched
occurrence of pat at its end.

After emitting this chunk, the function continues splitting the
remainder of target until all input has been consumed.

Unlike splitKeepFrontDFA, which keeps the matched prefix of pat
at the front of each chunk, splitKeepEndDFA ensures the match
appears at the end of each chunk.

If pat does not occur in target, the result is a singleton list
containing the original target.

Totality: total
Visibility: export
splitDropDFA : (pat : ByteString) -> (target : ByteString) ->F1s (ListByteString)
  Splits a ByteString into a list of pieces according to repeated
matches of pat inside target, dropping each matched
occurrence from the output entirely.

This function repeatedly searches target for occurrences of pat
(using the deterministic-finite-automaton matcher with overlap = False). Each time a
match is found at index i, the prefix of target of length i
(that is, the portion preceding the match) is emitted as the next
chunk. The matched substring itself is not included.

After emitting this prefix, the function continues splitting the
remainder of target, skipping over the full match of length
i + length pat. This process continues until the entire target
has been consumed.

Unlike splitKeepFrontDFA and splitKeepEndDFA, which include the
matched pattern in each emitted chunk, splitDropKMP removes all
occurrences of pat from the output.

If pat does not occur in target, the result is a singleton list
containing the original target.

Totality: total
Visibility: export
replaceDFA : (pat : ByteString) ->ByteString-> (target : ByteString) ->F1s (ListByteString)
  Replaces all non-overlapping occurrences of a pattern in a ByteString
using the deterministic-finite-automaton matcher.

This function repeatedly searches target for occurrences of pat
(using matcher False). Each time a match is found at index i:

* If i == 0, the match is at the current position. The matched
segment is dropped and sub is appended to the result (unless
sub is empty, in which case nothing is appended).

* If i > 0, the prefix take i target is appended to the result,
followed by sub (unless sub is empty). The matched segment is
then dropped and processing continues on the remaining suffix.

If no further matches are found, the remaining target is appended
unchanged and the result is returned.

The result is accumulated via a `SnocList` and returned as a `List
ByteString`, preserving left-to-right order of the produced chunks.

Totality: total
Visibility: export