Package com.hw.langchain.text.splitter
Class TextSplitter
java.lang.Object
com.hw.langchain.text.splitter.TextSplitter
- All Implemented Interfaces:
BaseDocumentTransformer
- Direct Known Subclasses:
CharacterTextSplitter, RecursiveCharacterTextSplitter
Base class for splitting text into chunks.
- Author:
HamaWhite
-
Field Summary
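| Modifier and Type | Field | Description |
| --- | --- | --- |
| `protected boolean` | `addStartIndex` | If `true`, includes the chunk's start index in metadata. |
| `protected int` | `chunkOverlap` | Overlap in characters between chunks. |
| `protected int` | `chunkSize` | Maximum size of chunks to return. |
| `protected boolean` | `keepSeparator` | Whether or not to keep the separator in the chunks. |
| | `lengthFunction` | Function that measures the length of given chunks. |
-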
Constructor Summary
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| | `createDocuments` | Create documents from a list of texts. |
| | `mergeSplits(List<String> splits, String separator)` | Combine the smaller split pieces into medium-sized chunks to send to the LLM. |
| | `splitDocuments(List<Document> documents)` | Split documents. |
| | `splitText` | Split text into multiple components. |
| | `transformDocuments` | Transform a sequence of documents by splitting them. |
-
Field Details
-
chunkSize
protected int chunkSize
Maximum size of chunks to return.
-
chunkOverlap
protected int chunkOverlap
Overlap in characters between chunks.
-
lengthFunction
Function that measures the length of given chunks.
-
keepSeparator
protected boolean keepSeparator
Whether or not to keep the separator in the chunks.
-
addStartIndex
protected boolean addStartIndex
If `true`, includes the chunk's start index in metadata.
-
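To make the meaning of `chunkSize`, `chunkOverlap`, and `lengthFunction` concrete, here is a minimal, self-contained sketch. It does not call this library and is not the library's splitting algorithm: it only slides a fixed-size character window over a string. The class `ChunkWindowSketch`, the method `window`, and the `Function<String, Integer>` type used for the length function are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class ChunkWindowSketch {

    // Hypothetical helper: each chunk holds at most chunkSize characters,
    // and consecutive chunks share chunkOverlap characters.
    static List<String> window(String text, int chunkSize, int chunkOverlap) {
        if (chunkSize <= 0 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
            throw new IllegalArgumentException("require 0 <= chunkOverlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - chunkOverlap; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the window reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Assumed shape of a length function: here, plain character count.
        Function<String, Integer> lengthFunction = String::length;
        for (String chunk : window("abcdefghij", 4, 1)) {
            System.out.println(chunk + " (length " + lengthFunction.apply(chunk) + ")");
        }
    }
}
```

With a chunk size of 4 and an overlap of 1, the window advances 3 characters at a time, so the last character of each chunk reappears as the first character of the next. A token-based length function could be swapped in wherever chunk lengths are measured in units other than characters.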
-
Constructor Details
-
TextSplitter
public TextSplitter()
-
-
Method Details
-
splitText
Split text into multiple components.
-
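As an illustration of what splitting on a separator with a `keepSeparator`-style flag can look like, the sketch below splits on a literal separator using only the JDK. It is not this class's implementation. The names `SeparatorSplitSketch` and `split` are hypothetical, and attaching the kept separator to the start of the following piece is an assumption, since the page does not say where the separator ends up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class SeparatorSplitSketch {

    // Hypothetical helper: split text on a literal separator.
    static List<String> split(String text, String separator, boolean keepSeparator) {
        // Quote the separator so regex metacharacters such as "." are treated literally.
        String quoted = Pattern.quote(separator);
        // A zero-width lookahead splits just before each separator,
        // so the separator stays at the start of the next piece.
        String regex = keepSeparator ? "(?=" + quoted + ")" : quoted;
        List<String> pieces = new ArrayList<>();
        for (String piece : text.split(regex)) {
            if (!piece.isEmpty()) {
                pieces.add(piece); // drop empty pieces produced by adjacent separators
            }
        }
        return pieces;
    }
}
```

Calling `split("a. b. c", ". ", false)` yields the pieces `a`, `b`, `c`, while passing `true` keeps the separator on the second and third pieces.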
createDocuments
Create documents from a list of texts.
-
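The `addStartIndex` field says a chunk's start index can be included in metadata. One plausible way to compute such an index is sketched below, under the assumption that it is the chunk's character offset in the original text and that document creation is where it would be recorded. The class `StartIndexSketch` and the method `startIndices` are hypothetical and do not come from this library.

```java
import java.util.ArrayList;
import java.util.List;

class StartIndexSketch {

    // Hypothetical helper: locate each chunk in the source text.
    // The search resumes one character after the previous chunk's start,
    // so overlapping chunks and repeated substrings resolve in order.
    static List<Integer> startIndices(String text, List<String> chunks) {
        List<Integer> indices = new ArrayList<>();
        int searchFrom = 0;
        for (String chunk : chunks) {
            int index = text.indexOf(chunk, searchFrom);
            indices.add(index); // -1 when the chunk is not a verbatim substring
            if (index >= 0) {
                searchFrom = index + 1;
            }
        }
        return indices;
    }
}
```

Searching forward from the previous match, instead of from the beginning each time, is what keeps two identical chunks from both reporting the first occurrence.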
splitDocuments
Split documents.
-
mergeSplits
Combine the smaller split pieces into medium-sized chunks to send to the LLM.
-
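The merging step described above can be sketched as a greedy pass over the pieces. The version below is loosely modeled on the merge used in LangChain-style splitters; whether this class behaves identically is an assumption. The standalone class `MergeSplitsSketch` is hypothetical, lengths are measured in characters, and `chunkSize` and `chunkOverlap` are passed as parameters only because the sketch has no fields.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class MergeSplitsSketch {

    // Hypothetical helper: greedily join pieces with the separator so that each
    // chunk stays within chunkSize, carrying up to chunkOverlap characters of
    // trailing pieces into the next chunk. Assumes chunkOverlap >= 0.
    static List<String> mergeSplits(List<String> splits, String separator,
                                    int chunkSize, int chunkOverlap) {
        int sepLen = separator.length();
        List<String> chunks = new ArrayList<>();
        Deque<String> current = new ArrayDeque<>();
        int total = 0; // length of the current pieces when joined by the separator
        for (String piece : splits) {
            int len = piece.length();
            if (!current.isEmpty() && total + sepLen + len > chunkSize) {
                // The incoming piece does not fit: emit what has been collected.
                chunks.add(String.join(separator, current));
                // Drop pieces from the front until the remainder fits the overlap
                // budget and leaves room for the incoming piece.
                while (total > chunkOverlap
                        || (total > 0 && total + sepLen + len > chunkSize)) {
                    total -= current.removeFirst().length()
                            + (current.isEmpty() ? 0 : sepLen);
                }
            }
            total += len + (current.isEmpty() ? 0 : sepLen);
            current.addLast(piece);
        }
        if (!current.isEmpty()) {
            chunks.add(String.join(separator, current));
        }
        return chunks;
    }
}
```

For the pieces `aa`, `bb`, `cc`, `dd` joined by a single space with a chunk size of 5, an overlap of 2 carries the last piece of each chunk into the next one, while an overlap of 0 produces disjoint chunks. A single piece longer than `chunkSize` is emitted as its own oversized chunk, since this sketch never splits a piece further.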
transformDocuments
Transform a sequence of documents by splitting them.
- Specified by:
transformDocuments in interface BaseDocumentTransformer
-