Package com.hw.langchain.text.splitter
Class MarkdownHeaderTextSplitter
java.lang.Object
com.hw.langchain.text.splitter.MarkdownHeaderTextSplitter
Splits Markdown files into chunks based on specified headers.
Author:
HamaWhite
Constructor Summary
Constructor
MarkdownHeaderTextSplitter(List<org.apache.commons.lang3.tuple.Pair<String, String>> headersToSplitOn)
MarkdownHeaderTextSplitter(List<org.apache.commons.lang3.tuple.Pair<String, String>> headersToSplitOn, boolean returnEachLine)
Method Summary
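aggregateLinesToChunks - Combine lines with common metadata into chunks.
splitText - Split markdown file.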
Constructor Details
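MarkdownHeaderTextSplitter(List<org.apache.commons.lang3.tuple.Pair<String, String>> headersToSplitOn)
MarkdownHeaderTextSplitter(List<org.apache.commons.lang3.tuple.Pair<String, String>> headersToSplitOn, boolean returnEachLine)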
Method Details
aggregateLinesToChunks
Combine lines with common metadata into chunks.
Parameters:
lines - lines of text, each with its associated header metadata
Returns:
List of Document chunks
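The aggregation step described for aggregateLinesToChunks can be illustrated with a minimal, standalone sketch. It is not the library's implementation: the class AggregateSketch, the type Piece, and the method aggregate are hypothetical names introduced here. The sketch also assumes that only consecutive lines with equal metadata are merged, and that merged lines are joined with a newline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not part of com.hw.langchain. */
final class AggregateSketch {

    /** A piece of text together with its header metadata (assumed shape). */
    static final class Piece {
        final String content;
        final Map<String, String> metadata;

        Piece(String content, Map<String, String> metadata) {
            this.content = content;
            this.metadata = metadata;
        }
    }

    /** Merges consecutive pieces whose metadata maps are equal. */
    static List<Piece> aggregate(List<Piece> lines) {
        List<Piece> chunks = new ArrayList<>();
        for (Piece line : lines) {
            int last = chunks.size() - 1;
            if (last >= 0 && chunks.get(last).metadata.equals(line.metadata)) {
                // Same metadata as the previous chunk: append to it.
                Piece prev = chunks.get(last);
                chunks.set(last, new Piece(prev.content + "\n" + line.content, prev.metadata));
            } else {
                // Metadata changed: start a new chunk.
                chunks.add(new Piece(line.content, line.metadata));
            }
        }
        return chunks;
    }
}
```

With three input pieces whose metadata is Intro, Intro, Usage, this sketch yields two chunks: the first two lines joined, then the third on its own.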
splitText
Split markdown file.
Parameters:
text - contents of the Markdown file
Returns:
List of Document chunks
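To make header-based splitting concrete, here is a minimal, self-contained sketch of the general technique. It does not reproduce this class's code: HeaderSplitSketch, Chunk, and split are hypothetical names, and the headers argument is assumed to map a marker such as "#" to a metadata key such as "Header 1". The sketch ignores fenced code blocks and assumes a header of a given level resets any deeper headers still in effect.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; not part of com.hw.langchain. */
final class HeaderSplitSketch {

    /** A chunk of body text plus the headers it sits under. */
    static final class Chunk {
        final String content;
        final Map<String, String> metadata;

        Chunk(String content, Map<String, String> metadata) {
            this.content = content;
            this.metadata = metadata;
        }
    }

    static List<Chunk> split(String text, Map<String, String> headers) {
        List<Chunk> chunks = new ArrayList<>();
        // Header level -> {metadata key, header title} currently in effect.
        TreeMap<Integer, String[]> active = new TreeMap<>();
        StringBuilder body = new StringBuilder();
        for (String raw : text.split("\n")) {
            String line = raw.trim();
            String marker = null;
            for (String candidate : headers.keySet()) {
                // "# " does not match "## Title", so levels stay distinct.
                if (line.startsWith(candidate + " ")) {
                    marker = candidate;
                }
            }
            if (marker == null) {
                body.append(line).append('\n');
                continue;
            }
            // A header closes the chunk collected so far.
            flush(chunks, body, active);
            int level = marker.length();
            // Drop this level and anything deeper, then record the new header.
            active.tailMap(level, true).clear();
            String title = line.substring(marker.length()).trim();
            active.put(level, new String[] {headers.get(marker), title});
        }
        flush(chunks, body, active);
        return chunks;
    }

    private static void flush(List<Chunk> chunks, StringBuilder body,
                              TreeMap<Integer, String[]> active) {
        String content = body.toString().trim();
        body.setLength(0);
        if (content.isEmpty()) {
            return; // nothing between two headers
        }
        Map<String, String> metadata = new LinkedHashMap<>();
        for (String[] keyAndTitle : active.values()) {
            metadata.put(keyAndTitle[0], keyAndTitle[1]);
        }
        chunks.add(new Chunk(content, metadata));
    }
}
```

Keeping the active headers in a map keyed by level makes the reset rule a single tailMap call. The real class may track header state differently.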