Class TikaUtils
java.lang.Object
uno.anahata.ai.internal.TikaUtils
Utility class for file type detection and content extraction using Apache Tika.
This class provides methods to identify the MIME type of a file and to extract its text content, supporting a wide range of formats (PDF, DOCX, etc.).
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type    Method                       Description
static String        detectAndParse(File file)    Detects the file type and parses the text content from a given file.
static String        detectMimeType(File file)    Detects the MIME type of a given file.
-
Constructor Details
-
TikaUtils
public TikaUtils()
-
-
Method Details
-
detectMimeType
Detects the MIME type of a given file.
-
detectAndParse
Detects the file type and parses the text content from a given file. This method uses Tika's auto-detection to choose the appropriate parser for the file format.
Parameters:
file - The file to parse.
Returns:
The extracted text content.
Throws:
Exception - if an error occurs during parsing.
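The detection and parsing behavior described above can be illustrated with Apache Tika's `org.apache.tika.Tika` facade. The following is a minimal sketch only, not the actual source of TikaUtils: the class name `TikaSketch` is hypothetical, the narrower `throws` clauses are an assumption (the documented method declares `Exception`), and it assumes tika-core plus a Tika parser module are on the classpath.

```java
import java.io.File;
import java.io.IOException;

import org.apache.tika.Tika;
import org.apache.tika.exception.TikaException;

// Hypothetical stand-in for illustration; not the real TikaUtils implementation.
public class TikaSketch {

    // Shared facade instance, built with Tika's default detector and parser.
    private static final Tika TIKA = new Tika();

    // Returns the detected MIME type as a string.
    public static String detectMimeType(File file) throws IOException {
        return TIKA.detect(file);
    }

    // Lets Tika pick a parser for the detected type and returns the extracted text.
    public static String detectAndParse(File file) throws IOException, TikaException {
        return TIKA.parseToString(file);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the file path is passed as the first command-line argument.
        File file = new File(args[0]);
        System.out.println("MIME type: " + detectMimeType(file));
        System.out.println(detectAndParse(file));
    }
}
```

In this sketch the facade limits the length of the string returned by `parseToString`; if TikaUtils is built the same way, very large documents may come back truncated unless the limit is raised with `Tika.setMaxStringLength`.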
-
