Package com.hw.langchain.text.splitter
Class TextSplitter
java.lang.Object
com.hw.langchain.text.splitter.TextSplitter
- All Implemented Interfaces:
BaseDocumentTransformer
- Direct Known Subclasses:
CharacterTextSplitter, RecursiveCharacterTextSplitter
Base class for splitting text into chunks.
- Author:
HamaWhite
Field Summary
Fields (modifier and type, field, description)
- protected boolean addStartIndex: If `true`, includes chunk's start index in metadata.
- protected int chunkOverlap: Overlap in characters between chunks.
- protected int chunkSize: Maximum size of chunks to return.
- protected boolean keepSeparator: Whether or not to keep the separator in the chunks.
- lengthFunction: Function that measures the length of given chunks.
Constructor Summary
Constructors
- TextSplitter()
Method Summary
Methods (modifier and type, method, description)
- createDocuments: Create documents from a list of texts.
- mergeSplits(List<String> splits, String separator): Combines the smaller pieces into medium-size chunks to send to the LLM.
- splitDocuments(List<Document> documents): Split documents.
- splitText: Split text into multiple components.
- transformDocuments: Transform a sequence of documents by splitting them.
Field Details
chunkSize
protected int chunkSize
Maximum size of chunks to return.
chunkOverlap
protected int chunkOverlap
Overlap in characters between chunks.
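To illustrate how chunkSize and chunkOverlap interact, here is a minimal sketch, not the library's code. It slides a fixed window over a string and advances by chunkSize minus chunkOverlap characters. The class ChunkWindowSketch and its windows method are hypothetical names introduced for this illustration; the actual splitters cut on separators rather than at fixed offsets.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of com.hw.langchain.
final class ChunkWindowSketch {

    // Cuts text into windows of at most chunkSize characters, where each
    // window repeats the last chunkOverlap characters of the previous one.
    static List<String> windows(String text, int chunkSize, int chunkOverlap) {
        if (chunkSize <= 0 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
            throw new IllegalArgumentException("require 0 <= chunkOverlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - chunkOverlap; // how far each window advances
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(windows("abcdefghij", 4, 2));
    }
}
```

With a chunkSize of 4 and a chunkOverlap of 2, consecutive windows in this sketch share two characters, so "abcdefghij" yields abcd, cdef, efgh, ghij.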
lengthFunction
Function that measures the length of given chunks.
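This page does not show the declared type of lengthFunction. The sketch below assumes a `Function<String, Integer>` and shows two possible measures, one counting characters and one counting words. LengthFunctionSketch, BY_CHARACTERS, BY_WORDS and fits are hypothetical names used only for illustration.

```java
import java.util.function.Function;

// Hypothetical illustration only; the field's actual type is an assumption.
final class LengthFunctionSketch {

    // Counts characters, the simplest possible measure.
    static final Function<String, Integer> BY_CHARACTERS = String::length;

    // Counts whitespace-separated words, a rough stand-in for a token count.
    static final Function<String, Integer> BY_WORDS = s -> {
        String trimmed = s.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    };

    // Returns true when the chunk fits the limit under the given measure.
    static boolean fits(String chunk, int chunkSize, Function<String, Integer> lengthFunction) {
        return lengthFunction.apply(chunk) <= chunkSize;
    }

    public static void main(String[] args) {
        String chunk = "split text into chunks";
        System.out.println(fits(chunk, 10, BY_CHARACTERS));
        System.out.println(fits(chunk, 10, BY_WORDS));
    }
}
```

The same chunk can exceed the limit under one measure and fit under another, which is why the unit of chunkSize depends on the length function in use.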
keepSeparator
protected boolean keepSeparator
Whether or not to keep the separator in the chunks.
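One way to picture the effect of keepSeparator is a regular-expression split with and without a lookahead. This is a sketch under assumptions: KeepSeparatorSketch is a hypothetical class, and attaching the separator to the start of the following piece is a choice made here, not something this page confirms about the library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of com.hw.langchain.
final class KeepSeparatorSketch {

    static List<String> split(String text, String separator, boolean keepSeparator) {
        // Quote the separator so it is matched literally, not as a regex.
        String quoted = Pattern.quote(separator);
        // A zero-width lookahead splits just before each separator, so the
        // separator stays attached to the start of the following piece.
        String regex = keepSeparator ? "(?=" + quoted + ")" : quoted;
        return Arrays.asList(text.split(regex));
    }

    public static void main(String[] args) {
        System.out.println(split("a, b, c", ", ", true));
        System.out.println(split("a, b, c", ", ", false));
    }
}
```

Keeping the separator means the original text can be rebuilt by concatenating the pieces, at the cost of slightly longer chunks.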
addStartIndex
protected boolean addStartIndex
If `true`, includes chunk's start index in metadata.
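A start index can be derived by searching for each chunk in the source text, beginning just past the previous chunk's start so that overlapping or repeated chunks resolve in order. The following is a minimal sketch of that idea; StartIndexSketch and startIndexes are hypothetical names, and how the library actually computes or stores the index is not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of com.hw.langchain.
final class StartIndexSketch {

    // Returns, for each chunk, the offset at which it occurs in text,
    // searching from one position past the previous chunk's start.
    static List<Integer> startIndexes(String text, List<String> chunks) {
        List<Integer> starts = new ArrayList<>();
        int index = -1;
        for (String chunk : chunks) {
            index = text.indexOf(chunk, index + 1);
            starts.add(index); // -1 means the chunk was not found verbatim
        }
        return starts;
    }

    public static void main(String[] args) {
        List<String> chunks = List.of("abcd", "cdef", "efgh", "ghij");
        System.out.println(startIndexes("abcdefghij", chunks));
    }
}
```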
Constructor Details
TextSplitter
public TextSplitter()
Method Details
splitText
Split text into multiple components.
createDocuments
Create documents from a list of texts.
splitDocuments
Split documents.
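The signatures of createDocuments and splitDocuments are only partly visible on this page, so the sketch below shows one plausible relationship between them rather than the library's implementation. It assumes that splitDocuments unpacks each document's content and metadata and delegates to createDocuments, which copies the metadata onto every chunk. Doc, SplitDocumentsSketch, and the splitText function parameter are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; Doc stands in for the library's Document class.
final class SplitDocumentsSketch {

    static final class Doc {
        final String pageContent;
        final Map<String, Object> metadata;

        Doc(String pageContent, Map<String, Object> metadata) {
            this.pageContent = pageContent;
            this.metadata = metadata;
        }
    }

    // Splits each text and gives every resulting chunk its own copy of the
    // metadata belonging to the text it came from.
    static List<Doc> createDocuments(List<String> texts, List<Map<String, Object>> metadatas,
            Function<String, List<String>> splitText) {
        List<Doc> documents = new ArrayList<>();
        for (int i = 0; i < texts.size(); i++) {
            for (String chunk : splitText.apply(texts.get(i))) {
                documents.add(new Doc(chunk, new HashMap<>(metadatas.get(i))));
            }
        }
        return documents;
    }

    // Unpacks content and metadata, then delegates to createDocuments.
    static List<Doc> splitDocuments(List<Doc> documents, Function<String, List<String>> splitText) {
        List<String> texts = new ArrayList<>();
        List<Map<String, Object>> metadatas = new ArrayList<>();
        for (Doc doc : documents) {
            texts.add(doc.pageContent);
            metadatas.add(doc.metadata);
        }
        return createDocuments(texts, metadatas, splitText);
    }

    public static void main(String[] args) {
        Function<String, List<String>> bySentence = s -> List.of(s.split("\\. "));
        List<Doc> input = List.of(new Doc("One. Two", Map.of("source", "a.txt")));
        for (Doc doc : splitDocuments(input, bySentence)) {
            System.out.println(doc.pageContent + " " + doc.metadata);
        }
    }
}
```

Copying the metadata map per chunk keeps later per-chunk additions, such as a start index, from leaking into sibling chunks.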
mergeSplits
Combines the smaller pieces into medium-size chunks to send to the LLM.
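A common way to perform this kind of merge is a greedy pass: keep appending pieces until the next one would exceed chunkSize, emit the chunk, then drop leading pieces until the retained tail is no longer than chunkOverlap. The sketch below is a simplified illustration of that approach, not the library's code. MergeSplitsSketch is a hypothetical class, length is measured in characters, and chunkSize and chunkOverlap are passed as parameters here instead of being read from fields.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; not part of com.hw.langchain.
final class MergeSplitsSketch {

    static List<String> mergeSplits(List<String> splits, String separator,
            int chunkSize, int chunkOverlap) {
        int sepLen = separator.length();
        List<String> chunks = new ArrayList<>();
        Deque<String> current = new ArrayDeque<>();
        int total = 0; // length of the current pieces once joined by separator

        for (String piece : splits) {
            int len = piece.length();
            // Emit the current chunk if adding this piece would overflow it.
            if (!current.isEmpty() && total + sepLen + len > chunkSize) {
                chunks.add(String.join(separator, current));
                // Drop leading pieces until the kept tail fits within the
                // overlap and leaves room for the incoming piece.
                while (total > chunkOverlap
                        || (total > 0 && total + (current.isEmpty() ? 0 : sepLen) + len > chunkSize)) {
                    String dropped = current.removeFirst();
                    total -= dropped.length() + (current.isEmpty() ? 0 : sepLen);
                }
            }
            current.addLast(piece);
            total += len + (current.size() > 1 ? sepLen : 0);
        }
        if (!current.isEmpty()) {
            chunks.add(String.join(separator, current));
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(mergeSplits(List.of("aa", "bb", "cc", "dd"), " ", 5, 2));
    }
}
```

With the pieces aa, bb, cc, dd, a single-space separator, a chunkSize of 5 and a chunkOverlap of 2, this sketch returns aa bb, bb cc, cc dd. A single piece longer than chunkSize is emitted as its own oversized chunk rather than being cut.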
transformDocuments
Transform a sequence of documents by splitting them.
- Specified by:
transformDocuments in interface BaseDocumentTransformer