WPS4Text Class Reference

The class which parses text zones in a pc MS Works document v1-4.

#include <WPS4Text.h>

Inheritance diagram for WPS4Text:
WPSTextParser

Public Member Functions

 WPS4Text (WPS4Parser &parser, RVNGInputStreamPtr &input)
 constructor
 ~WPS4Text () override
 destructor
void setListener (WPSContentListenerPtr &listen)
 sets the listener
int numPages () const
 returns the number of pages
void flushExtra ()
 sends the data which have not yet been sent to the listener
void sendObjects (int page)
 sends all the objects whose page anchor corresponds to the given page
Public Member Functions inherited from WPSTextParser
virtual ~WPSTextParser ()
 virtual destructor
int version () const
 returns the file version
RVNGInputStreamPtr & getInput ()
 returns the actual input

Protected Types

typedef bool(WPS4Text::* DataParser) (long bot, long eot, int id, long endPos, std::string &mess)
 definition of the plc data parser (low level)
Protected Types inherited from WPSTextParser
typedef bool(WPSTextParser::* FDPParser) (long endPos, int &id, std::string &mess)
 callback when a new attribute is found in an FDPP/FDPC entry

Protected Member Functions

WPS4Parser & mainParser ()
 return the main parser
WPS4Parser const & mainParser () const
 return the main parser
WPS4TextInternal::Font getDefaultFont () const
 returns the default font to use for the document
WPSEntry getHeaderEntry () const
 returns the header entry (if such entry exists, if not returns an invalid entry)
WPSEntry getFooterEntry () const
 returns the footer entry (if such entry exists, if not returns an invalid entry)
WPSEntry getMainTextEntry () const
 returns the main text entry (if such entry exists, if not returns an invalid entry)
bool readText (WPSEntry const &entry)
 reads a text section and sends it to a listener
bool readEntries ()
 finds all text entries (TEXT, SHdr, SFtr, BTEC, BTEP, FTNp, FTNd, BKMK, FONT, CHRT)
bool readStructures ()
 parses all the text entries
bool findFDPStructures (int which)
bool findFDPStructuresByHand (int which)
bool readPLC (WPSEntry const &zone, std::vector< long > &textPtrs, std::vector< long > &listValues, DataParser parser=nullptr)
 reads a PLC (Pointer List Composant ?) in zone entry
bool defDataParser (long bot, long eot, int id, long endPos, std::string &mess)
 default plc reader
bool readFontNames (WPSEntry const &entry)
 reads the font names
bool readFont (long endPos, int &id, std::string &mess)
 reads the font properties
bool readParagraph (long endPos, int &id, std::string &mess)
 reads the paragraph properties
bool readDosLink (WPSEntry const &entry)
 reads the ZZDLink (a list of filenames)
bool objectDataParser (long bot, long eot, int id, long endPos, std::string &mess)
 reads an object's properties (position in text, size and definition in file)
bool readFootNotes (WPSEntry const &ftnD, WPSEntry const &ftnP)
 reads the footnotes positions and definitions ( zones FTNd and FTNp)
bool footNotesDataParser (long bot, long eot, int id, long endPos, std::string &mess)
 reads a footnote property
bool bkmkDataParser (long bot, long eot, int id, long endPos, std::string &mess)
 reads a book mark property ( string)
bool dttmDataParser (long bot, long eot, int id, long endPos, std::string &mess)
 reads a date time property
Protected Member Functions inherited from WPSTextParser
 WPSTextParser (WPSParser &parser, RVNGInputStreamPtr &input)
 constructor
std::multimap< std::string, WPSEntry > & getNameEntryMap ()
 returns the map type->entry
std::multimap< std::string, WPSEntry > const & getNameEntryMap () const
 returns the map type->entry
std::vector< DataFOD > mergeSortedFODLists (std::vector< DataFOD > const &lst1, std::vector< DataFOD > const &lst2) const
 function which takes two lists of attributes, each sorted by text position, and merges them
bool readFDP (WPSEntry const &entry, std::vector< DataFOD > &fods, FDPParser parser)
 parses a FDPP or a FDPC entry (which contains a list of ATTR_TEXT/ATTR_PARAG with their definition ) and adds found data in listFODs
libwps::DebugFile & ascii ()
 a DebugFile used to write what we recognize when we parse the document
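
For mergeSortedFODLists listed above, the following is a minimal sketch of merging two lists that are each sorted by text position. `ToyFOD` and `mergeSortedToyLists` are hypothetical names introduced for illustration only; the real `DataFOD` type and the way libwps breaks ties between equal positions are not described in this reference.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical stand-in for DataFOD: a text position plus an attribute id.
struct ToyFOD
{
    long pos;
    int id;
};

// Orders attributes by text position only.
static bool byPosition(ToyFOD const &a, ToyFOD const &b)
{
    return a.pos < b.pos;
}

// Merges two lists that are already sorted by position into one sorted list.
// std::merge is stable: for equal positions, entries of lst1 come first
// (an assumption of this sketch, not a documented libwps rule).
std::vector<ToyFOD> mergeSortedToyLists(std::vector<ToyFOD> const &lst1,
                                        std::vector<ToyFOD> const &lst2)
{
    assert(std::is_sorted(lst1.begin(), lst1.end(), byPosition));
    assert(std::is_sorted(lst2.begin(), lst2.end(), byPosition));
    std::vector<ToyFOD> res;
    res.reserve(lst1.size() + lst2.size());
    std::merge(lst1.begin(), lst1.end(), lst2.begin(), lst2.end(),
               std::back_inserter(res), byPosition);
    return res;
}
```

Merging in one pass keeps the cost linear in the total number of attributes, which is why both inputs are required to be sorted beforehand.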

Protected Attributes

WPSContentListenerPtr m_listener
 the listener
std::shared_ptr< WPS4TextInternal::State > m_state
 the internal state
Protected Attributes inherited from WPSTextParser
int m_version
 the file version
RVNGInputStreamPtr m_input
 the main input
WPSParser & m_mainParser
 pointer to the main zone parser
WPSEntry m_textPositions
 an entry which corresponds to the complete text zone
std::vector< DataFOD > m_FODList
 the list of FODs
libwps::DebugFile & m_asciiFile
 the ascii file

Friends

class WPS4Parser

Detailed Description

The class which parses text zones in a pc MS Works document v1-4.

This class must be associated with a WPS4Parser. It finds and reads:

  • TEXT[3] : the text limits ( header, footer, main text with notes)
  • SHdr, SFtr : a string to store header/footer in v1-2 (?)
  • BTEC : the fonts properties
  • BTEP : the paragraph properties
  • FONT : the font names
  • FTNp, FTNd : the footnote positions (text position and text of notes)
  • BKMK : a comment field ( contain a string )
  • CHRT : a chart ( unknown format )

It reads:

  • DTTM : field contents ( only parsed)
  • EOBJ : the text position with the position and size of an object
Note
It also reads the size of the document, because this size is stored between the "entries" which define the text positions and the BTEC positions.

Member Typedef Documentation

◆ DataParser

typedef bool(WPS4Text::* WPS4Text::DataParser) (long bot, long eot, int id, long endPos, std::string &mess)
protected

definition of the plc data parser (low level)

Parameters
endPos  the end of the properties' definition
bot     the beginning of the text zone
eot     the end of the text zone
id      the number of this property
mess    a string which can be filled to indicate unparsed data
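
The typedef above is a pointer to a member function. As a minimal sketch of how such a callback can be stored and invoked, the example below uses a hypothetical class `ToyText` with hypothetical methods `rangeParser` and `run`; only the signature mirrors `DataParser`, and nothing here is taken from the libwps implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for WPS4Text; only the callback shape mirrors the typedef.
class ToyText
{
public:
    // Same shape as WPS4Text::DataParser.
    typedef bool (ToyText::*DataParser)(long bot, long eot, int id, long endPos, std::string &mess);

    // Accepts any non-empty text range and describes it in mess.
    bool rangeParser(long bot, long eot, int id, long /*endPos*/, std::string &mess)
    {
        assert(id >= 0); // ids are treated as indices (an assumption of this sketch)
        if (eot <= bot) return false;
        mess = "id=" + std::to_string(id) + ",len=" + std::to_string(eot - bot);
        return true;
    }

    // Invokes the given parser on this object; a null parser falls back to rangeParser.
    bool run(long bot, long eot, int id, long endPos, std::string &mess, DataParser parser = nullptr)
    {
        if (!parser) parser = &ToyText::rangeParser;
        return (this->*parser)(bot, eot, id, endPos, mess);
    }
};
```

The `(this->*parser)(...)` syntax is what distinguishes a member-function pointer call from an ordinary function pointer call: the callback always needs an object to run on.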

Constructor & Destructor Documentation

◆ WPS4Text()

WPS4Text::WPS4Text ( WPS4Parser & parser,
RVNGInputStreamPtr & input )

constructor

◆ ~WPS4Text()

WPS4Text::~WPS4Text ( )
override

destructor

Member Function Documentation

◆ bkmkDataParser()

bool WPS4Text::bkmkDataParser ( long bot,
long eot,
int id,
long endPos,
std::string & mess )
protected

reads a book mark property ( string)

Referenced by readStructures().

◆ defDataParser()

bool WPS4Text::defDataParser ( long bot,
long eot,
int id,
long endPos,
std::string & mess )
protected

default plc reader

Referenced by readPLC().

◆ dttmDataParser()

bool WPS4Text::dttmDataParser ( long bot,
long eot,
int id,
long endPos,
std::string & mess )
protected

reads a date time property

Referenced by readStructures().

◆ findFDPStructures()

bool WPS4Text::findFDPStructures ( int which)
protected

finds the FDPC/FDPP structure using the BTEC/BTEP entries

Parameters
which  == 0 means FDPP

Referenced by readStructures().

◆ findFDPStructuresByHand()

bool WPS4Text::findFDPStructuresByHand ( int which)
protected

finds the FDPC/FDPP structure by searching after the text zone

Parameters
which  == 0 means FDPP

Referenced by readStructures().

◆ flushExtra()

void WPS4Text::flushExtra ( )

sends the data which have not yet been sent to the listener

◆ footNotesDataParser()

bool WPS4Text::footNotesDataParser ( long bot,
long eot,
int id,
long endPos,
std::string & mess )
protected

reads a footnote property

Referenced by readFootNotes().

◆ getDefaultFont()

WPS4TextInternal::Font WPS4Text::getDefaultFont ( ) const
protected

returns the default font to use for the document

Referenced by flushExtra(), readFont(), and readText().

◆ getFooterEntry()

WPSEntry WPS4Text::getFooterEntry ( ) const
protected

returns the footer entry (if such entry exists, if not returns an invalid entry)

◆ getHeaderEntry()

WPSEntry WPS4Text::getHeaderEntry ( ) const
protected

returns the header entry (if such entry exists, if not returns an invalid entry)

◆ getMainTextEntry()

WPSEntry WPS4Text::getMainTextEntry ( ) const
protected

returns the main text entry (if such entry exists, if not returns an invalid entry)
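
The three accessors above return an invalid entry when the zone does not exist, so a caller has to test validity before reading. A minimal sketch of that pattern follows; `ToyEntry`, `ToyDocument`, `readableZones` and the validity rule (non-negative start, positive length) are assumptions made for illustration, not the actual `WPSEntry` definition.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical minimal entry: usable only if it starts at a
// non-negative offset and has a positive length.
struct ToyEntry
{
    long begin = -1;
    long length = 0;

    bool valid() const { return begin >= 0 && length > 0; }

    // End offset; only meaningful for a valid entry.
    long end() const
    {
        assert(valid());
        return begin + length;
    }
};

// Hypothetical holder for the three text zones.
struct ToyDocument
{
    ToyEntry header, footer, mainText;
};

// Returns the names of the zones that can be read, skipping invalid entries.
std::vector<std::string> readableZones(ToyDocument const &doc)
{
    std::vector<std::string> names;
    if (doc.header.valid()) names.push_back("header");
    if (doc.mainText.valid()) names.push_back("main text");
    if (doc.footer.valid()) names.push_back("footer");
    return names;
}
```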

◆ mainParser() [1/2]

WPS4Parser & WPS4Text::mainParser ( )
inline protected

◆ mainParser() [2/2]

WPS4Parser const & WPS4Text::mainParser ( ) const
inline protected

return the main parser

◆ numPages()

int WPS4Text::numPages ( ) const

returns the number of pages

◆ objectDataParser()

bool WPS4Text::objectDataParser ( long bot,
long eot,
int id,
long endPos,
std::string & mess )
protected

reads an object's properties (position in text, size and definition in file)

Referenced by readStructures().

◆ readDosLink()

bool WPS4Text::readDosLink ( WPSEntry const & entry)
protected

reads the ZZDLink (a list of filenames)

Referenced by readStructures().

◆ readEntries()

bool WPS4Text::readEntries ( )
protected

finds all text entries (TEXT, SHdr, SFtr, BTEC, BTEP, FTNp, FTNd, BKMK, FONT, CHRT)

◆ readFont()

bool WPS4Text::readFont ( long endPos,
int & id,
std::string & mess )
protected

reads the font properties

Referenced by readStructures().

◆ readFontNames()

bool WPS4Text::readFontNames ( WPSEntry const & entry)
protected

reads the font names

Referenced by readStructures().

◆ readFootNotes()

bool WPS4Text::readFootNotes ( WPSEntry const & ftnD,
WPSEntry const & ftnP )
protected

reads the footnotes positions and definitions ( zones FTNd and FTNp)

Referenced by readStructures().

◆ readParagraph()

bool WPS4Text::readParagraph ( long endPos,
int & id,
std::string & mess )
protected

reads the paragraph properties

Referenced by readStructures().

◆ readPLC()

bool WPS4Text::readPLC ( WPSEntry const & zone,
std::vector< long > & textPtrs,
std::vector< long > & listValues,
WPS4Text::DataParser parser = nullptr )
protected

reads a PLC (Pointer List Composant ?) in zone entry

Parameters
zone        the zone of the data in the file
textPtrs    list of offsets in the text zone where the properties change
listValues  list of property values (filled only if the values are simple types: int, ..)
parser      the parser to use to read the values

Referenced by findFDPStructures(), readFootNotes(), and readStructures().
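
To illustrate the two output vectors, here is a minimal sketch of a PLC-style reader. The layout it assumes (n+1 little-endian 32-bit text offsets followed by n 32-bit values) is an assumption chosen for the example; this reference does not state the on-disk layout that readPLC actually handles, and `readU32` and `readToyPLC` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads a little-endian 32-bit value at pos.
static long readU32(std::vector<std::uint8_t> const &data, std::size_t pos)
{
    assert(pos + 4 <= data.size());
    std::uint32_t v = std::uint32_t(data[pos]) | (std::uint32_t(data[pos + 1]) << 8) |
                      (std::uint32_t(data[pos + 2]) << 16) | (std::uint32_t(data[pos + 3]) << 24);
    return long(v);
}

// Assumed toy layout: n+1 offsets, then n values, 4 bytes each.
// Fills textPtrs with the offsets and listValues with the values.
bool readToyPLC(std::vector<std::uint8_t> const &zone,
                std::vector<long> &textPtrs, std::vector<long> &listValues)
{
    textPtrs.clear();
    listValues.clear();
    // total size must be 4 * (2n + 1) with n >= 1
    if (zone.size() < 12 || zone.size() % 8 != 4) return false;
    std::size_t const n = (zone.size() - 4) / 8;
    for (std::size_t i = 0; i <= n; ++i)
    {
        long const ptr = readU32(zone, 4 * i);
        // text offsets are expected to be non-decreasing
        if (!textPtrs.empty() && ptr < textPtrs.back())
        {
            textPtrs.clear();
            return false;
        }
        textPtrs.push_back(ptr);
    }
    for (std::size_t i = 0; i < n; ++i)
        listValues.push_back(readU32(zone, 4 * (n + 1 + i)));
    return true;
}
```

With this layout, value i applies to the text range from textPtrs[i] to textPtrs[i+1], which is why there is one more offset than there are values.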

◆ readStructures()

bool WPS4Text::readStructures ( )
protected

parses all the text entries

◆ readText()

bool WPS4Text::readText ( WPSEntry const & entry)
protected

reads a text section and sends it to a listener

Referenced by flushExtra().

◆ sendObjects()

void WPS4Text::sendObjects ( int page)

sends all the objects whose page anchor corresponds to the given page

Parameters
page  if page < 0, sends all the pictures which have a page anchor
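
The selection rule for the page parameter can be sketched as follows. `ToyObject` and `sendToyObjects` are hypothetical; the 1-based page numbering, the use of 0 for "no page anchor", and the rule that an object is never sent twice are all assumptions of this sketch rather than documented libwps behaviour.

```cpp
#include <cassert>
#include <vector>

// Hypothetical object record: page is 1-based, 0 means "no page anchor".
struct ToyObject
{
    int page;
    bool sent;
};

// Marks the matching objects as sent and returns how many were sent.
// page < 0 selects every object that has a page anchor.
int sendToyObjects(std::vector<ToyObject> &objects, int page)
{
    assert(page != 0); // 0 is reserved for "no anchor" in this sketch
    int count = 0;
    for (auto &obj : objects)
    {
        if (obj.sent) continue;                       // assumption: never send twice
        if (obj.page <= 0) continue;                  // no page anchor
        if (page > 0 && obj.page != page) continue;   // wrong page
        obj.sent = true;
        ++count;
    }
    return count;
}
```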

◆ setListener()

void WPS4Text::setListener ( WPSContentListenerPtr & listen)
inline

sets the listener

◆ WPS4Parser

friend class WPS4Parser
friend

Member Data Documentation

◆ m_listener

WPSContentListenerPtr WPS4Text::m_listener
protected

the listener

Referenced by flushExtra(), readText(), setListener(), and WPS4Text().

◆ m_state


The documentation for this class was generated from the following files:

Generated on Sat Jul 19 2025 05:24:40 for libwps by doxygen 1.14.0