xapian-core
1.4.24
This class provides read/write access to a database.
#include <database.h>
Public Member Functions | |
virtual | ~WritableDatabase () |
Destroy this handle on the database. | |
WritableDatabase () | |
Create a WritableDatabase with no subdatabases. | |
WritableDatabase (const std::string &path, int flags=0, int block_size=0) | |
Open a database for update, automatically determining the database backend to use. | |
WritableDatabase (const WritableDatabase &other) | |
Copying is allowed. | |
void | operator= (const WritableDatabase &other) |
Assignment is allowed. | |
void | add_database (const WritableDatabase &other) |
Add shards from another WritableDatabase. | |
void | commit () |
Commit any pending modifications made to the database. | |
void | flush () |
Pre-1.1.0 name for commit(). | |
void | begin_transaction (bool flushed=true) |
Begin a transaction. | |
void | commit_transaction () |
Complete the transaction currently in progress. | |
void | cancel_transaction () |
Abort the transaction currently in progress, discarding the pending modifications made to the database. | |
Xapian::docid | add_document (const Xapian::Document &document) |
Add a new document to the database. | |
void | delete_document (Xapian::docid did) |
Delete a document from the database. | |
void | delete_document (const std::string &unique_term) |
Delete any documents indexed by a term from the database. | |
void | replace_document (Xapian::docid did, const Xapian::Document &document) |
Replace a given document in the database. | |
Xapian::docid | replace_document (const std::string &unique_term, const Xapian::Document &document) |
Replace any documents matching a term. | |
void | add_spelling (const std::string &word, Xapian::termcount freqinc=1) const |
Add a word to the spelling dictionary. | |
void | remove_spelling (const std::string &word, Xapian::termcount freqdec=1) const |
Remove a word from the spelling dictionary. | |
void | add_synonym (const std::string &term, const std::string &synonym) const |
Add a synonym for a term. | |
void | remove_synonym (const std::string &term, const std::string &synonym) const |
Remove a synonym for a term. | |
void | clear_synonyms (const std::string &term) const |
Remove all synonyms for a term. | |
void | set_metadata (const std::string &key, const std::string &metadata) |
Set the user-specified metadata associated with a given key. | |
std::string | get_description () const |
Return a string describing this object. | |
Public Member Functions inherited from Xapian::Database | |
void | add_database (const Database &database) |
Add an existing database (or group of databases) to those accessed by this object. | |
size_t | size () const |
Return number of shards in this Database object. | |
Database () | |
Create a Database with no databases in. | |
Database (const std::string &path, int flags=0) | |
Open a Database, automatically determining the database backend to use. | |
Database (int fd, int flags=0) | |
Open a single-file Database. | |
virtual | ~Database () |
Destroy this handle on the database. | |
Database (const Database &other) | |
Copying is allowed. | |
void | operator= (const Database &other) |
Assignment is allowed. | |
bool | reopen () |
Re-open the database. | |
virtual void | close () |
Close the database. | |
PostingIterator | postlist_begin (const std::string &tname) const |
An iterator pointing to the start of the postlist for a given term. | |
PostingIterator | postlist_end (const std::string &) const |
Corresponding end iterator to postlist_begin(). | |
TermIterator | termlist_begin (Xapian::docid did) const |
An iterator pointing to the start of the termlist for a given document. | |
TermIterator | termlist_end (Xapian::docid) const |
Corresponding end iterator to termlist_begin(). | |
bool | has_positions () const |
Does this database have any positional information? | |
PositionIterator | positionlist_begin (Xapian::docid did, const std::string &tname) const |
An iterator pointing to the start of the position list for a given term in a given document. | |
PositionIterator | positionlist_end (Xapian::docid, const std::string &) const |
Corresponding end iterator to positionlist_begin(). | |
TermIterator | allterms_begin (const std::string &prefix=std::string()) const |
An iterator which runs across all terms with a given prefix. | |
TermIterator | allterms_end (const std::string &=std::string()) const |
Corresponding end iterator to allterms_begin(prefix). | |
Xapian::doccount | get_doccount () const |
Get the number of documents in the database. | |
Xapian::docid | get_lastdocid () const |
Get the highest document id which has been used in the database. | |
Xapian::doclength | get_avlength () const |
Get the average length of the documents in the database. | |
double | get_average_length () const |
New name for get_avlength(). | |
Xapian::totallength | get_total_length () const |
Get the total length of all the documents in the database. | |
Xapian::doccount | get_termfreq (const std::string &tname) const |
Get the number of documents in the database indexed by a given term. | |
bool | term_exists (const std::string &tname) const |
Check if a given term exists in the database. | |
Xapian::termcount | get_collection_freq (const std::string &tname) const |
Return the total number of occurrences of the given term. | |
Xapian::doccount | get_value_freq (Xapian::valueno slot) const |
Return the frequency of a given value slot. | |
std::string | get_value_lower_bound (Xapian::valueno slot) const |
Get a lower bound on the values stored in the given value slot. | |
std::string | get_value_upper_bound (Xapian::valueno slot) const |
Get an upper bound on the values stored in the given value slot. | |
Xapian::termcount | get_doclength_lower_bound () const |
Get a lower bound on the length of a document in this DB. | |
Xapian::termcount | get_doclength_upper_bound () const |
Get an upper bound on the length of a document in this DB. | |
Xapian::termcount | get_wdf_upper_bound (const std::string &term) const |
Get an upper bound on the wdf of term term. | |
ValueIterator | valuestream_begin (Xapian::valueno slot) const |
Return an iterator over the value in slot slot for each document. | |
ValueIterator | valuestream_end (Xapian::valueno) const |
Return end iterator corresponding to valuestream_begin(). | |
Xapian::termcount | get_doclength (Xapian::docid did) const |
Get the length of a document. | |
Xapian::termcount | get_unique_terms (Xapian::docid did) const |
Get the number of unique terms in document. | |
void | keep_alive () |
Send a "keep-alive" to remote databases to stop them timing out. | |
Xapian::Document | get_document (Xapian::docid did) const |
Get a document from the database, given its document id. | |
Xapian::Document | get_document (Xapian::docid did, unsigned flags) const |
Get a document from the database, given its document id. | |
std::string | get_spelling_suggestion (const std::string &word, unsigned max_edit_distance=2) const |
Suggest a spelling correction. | |
Xapian::TermIterator | spellings_begin () const |
An iterator which returns all the spelling correction targets. | |
Xapian::TermIterator | spellings_end () const |
Corresponding end iterator to spellings_begin(). | |
Xapian::TermIterator | synonyms_begin (const std::string &term) const |
An iterator which returns all the synonyms for a given term. | |
Xapian::TermIterator | synonyms_end (const std::string &) const |
Corresponding end iterator to synonyms_begin(term). | |
Xapian::TermIterator | synonym_keys_begin (const std::string &prefix=std::string()) const |
An iterator which returns all terms which have synonyms. | |
Xapian::TermIterator | synonym_keys_end (const std::string &=std::string()) const |
Corresponding end iterator to synonym_keys_begin(prefix). | |
std::string | get_metadata (const std::string &key) const |
Get the user-specified metadata associated with a given key. | |
Xapian::TermIterator | metadata_keys_begin (const std::string &prefix=std::string()) const |
An iterator which returns all user-specified metadata keys. | |
Xapian::TermIterator | metadata_keys_end (const std::string &=std::string()) const |
Corresponding end iterator to metadata_keys_begin(). | |
std::string | get_uuid () const |
Get a UUID for the database. | |
bool | locked () const |
Test if this database is currently locked for writing. | |
Xapian::rev | get_revision () const |
Get the revision of the database. | |
void | compact (const std::string &output, unsigned flags=0, int block_size=0) |
Produce a compact version of this database. | |
void | compact (int fd, unsigned flags=0, int block_size=0) |
Produce a compact version of this database. | |
void | compact (const std::string &output, unsigned flags, int block_size, Xapian::Compactor &compactor) |
Produce a compact version of this database. | |
void | compact (int fd, unsigned flags, int block_size, Xapian::Compactor &compactor) |
Produce a compact version of this database. | |
Additional Inherited Members | |
Static Public Member Functions inherited from Xapian::Database | |
static size_t | check (const std::string &path, int opts=0, std::ostream *out=NULL) |
Check the integrity of a database or database table. | |
static size_t | check (int fd, int opts=0, std::ostream *out=NULL) |
Check the integrity of a single file database. | |
This class provides read/write access to a database.
virtual Xapian::WritableDatabase::~WritableDatabase | ( | ) |
Destroy this handle on the database.
If no other handles to this database remain, the database will be closed.
If a transaction is active cancel_transaction() will be implicitly called; if no transaction is active commit() will be implicitly called, but any exception will be swallowed (because throwing exceptions in C++ destructors is problematic). If you aren't using transactions and want to know about any failure to commit changes, call commit() explicitly before the destructor gets called.
Xapian::WritableDatabase::WritableDatabase | ( | ) |
Create a WritableDatabase with no subdatabases.
The created object isn't very useful in this state - it's intended as a placeholder value.
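Xapian::WritableDatabase::WritableDatabase | ( | const std::string & | path, |
int | flags = 0, |
int | block_size = 0 |
) | explicit |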
Open a database for update, automatically determining the database backend to use.
If the database is to be created, Xapian will try to create the directory indicated by path if it doesn't already exist (but only the leaf directory, not recursively).
path | directory that the database is stored in. |
flags | one of:
|
Additionally, the following flags can be combined with action using bitwise-or (| in C++):
block_size | If a new database is created, this specifies the block size (in bytes) for backends which have such a concept. For chert and glass, the block size must be a power of 2 between 2048 and 65536 (inclusive), and the default (also used if an invalid value is passed) is 8192 bytes. |
Xapian::DatabaseCorruptError | will be thrown if the database is in a corrupt state. |
Xapian::DatabaseLockError | will be thrown if a lock couldn't be acquired on the database. |
Xapian::WritableDatabase::WritableDatabase | ( | const WritableDatabase & | other | ) |
Copying is allowed.
The internals are reference counted, so copying is cheap.
other | The object to copy. |
void Xapian::WritableDatabase::add_database | ( | const WritableDatabase & | other | ) | inline |
Add shards from another WritableDatabase.
Any shards in other are added to the list of shards in this object. The shards are reference counted and also remain in other.
other | Another WritableDatabase object to add shards from |
Xapian::docid Xapian::WritableDatabase::add_document | ( | const Xapian::Document & | document | ) |
Add a new document to the database.
This method adds the specified document to the database, returning a newly allocated document ID. Automatically allocated document IDs come from a per-database monotonically increasing counter, so IDs from deleted documents won't be reused.
If you want to specify the document ID to be used, you should call replace_document() instead.
Note that changes to the database won't be immediately committed to disk; see commit() for more details.
As with all database modification operations, the effect is atomic: the document will either be fully added, or the document fails to be added and an exception is thrown (possibly at a later time when commit() is called or the database is closed).
document | The new document to be added. |
Xapian::DatabaseError | will be thrown if a problem occurs while writing to the database. |
Xapian::DatabaseCorruptError | will be thrown if the database is in a corrupt state. |
void Xapian::WritableDatabase::add_spelling | ( | const std::string & | word, |
Xapian::termcount | freqinc = 1 |
) | const |
Add a word to the spelling dictionary.
If the word is already present, its frequency is increased.
word | The word to add. |
freqinc | How much to increase its frequency by (default 1). |
void Xapian::WritableDatabase::add_synonym | ( | const std::string & | term, |
const std::string & | synonym |
) | const |
Add a synonym for a term.
term | The term to add a synonym for. |
synonym | The synonym to add. If this is already a synonym for term, then no action is taken. |
void Xapian::WritableDatabase::begin_transaction | ( | bool | flushed = true | ) |
Begin a transaction.
In Xapian a transaction is a group of modifications to the database which are linked such that either all will be applied simultaneously or none will be applied at all. Even in the case of a power failure, this characteristic should be preserved (as long as the filesystem isn't corrupted, etc).
However, note that if called on a sharded database, atomicity isn't guaranteed between shards. Within each shard, the transaction will still act atomically.
A transaction is started with begin_transaction() and can either be committed by calling commit_transaction() or aborted by calling cancel_transaction().
By default, a transaction implicitly calls commit() before and after so that the modifications stand and fall without affecting modifications before or after.
The downside of these implicit calls to commit() is that small transactions can harm indexing performance in the same way that explicitly calling commit() frequently can.
If you're applying atomic groups of changes and only wish to ensure that each group is either applied or not applied, then you can prevent the automatic commit() before and after the transaction by starting the transaction with begin_transaction(false). However, if cancel_transaction() is called (or if commit_transaction() isn't called before the WritableDatabase object is destroyed) then any changes which were pending before the transaction began will also be discarded.
Transactions aren't currently supported by the InMemory backend.
flushed | Is this a flushed transaction? By default transactions are "flushed", which means that committing a transaction will ensure those changes are permanently written to the database. By contrast, unflushed transactions only ensure that changes within the transaction are either all applied or not applied at all. |
Xapian::UnimplementedError | will be thrown if transactions are not available for this database type. |
Xapian::InvalidOperationError | will be thrown if this is called at an invalid time, such as when a transaction is already in progress. |
void Xapian::WritableDatabase::cancel_transaction | ( | ) |
Abort the transaction currently in progress, discarding the pending modifications made to the database.
If an error occurs in this method, an exception will be thrown, but the transaction will be cancelled anyway.
Xapian::DatabaseError | will be thrown if a problem occurs while modifying the database. |
Xapian::DatabaseCorruptError | will be thrown if the database is in a corrupt state. |
Xapian::InvalidOperationError | will be thrown if a transaction is not currently in progress. |
Xapian::UnimplementedError | will be thrown if transactions are not available for this database type. |
void Xapian::WritableDatabase::clear_synonyms | ( | const std::string & | term | ) | const |
Remove all synonyms for a term.
term | The term to remove all synonyms for. If the term has no synonyms, no action is taken. |
void Xapian::WritableDatabase::commit | ( | ) |
Commit any pending modifications made to the database.
For efficiency reasons, when performing multiple updates to a database it is best (indeed, almost essential) to make as many modifications as memory will permit in a single pass through the database. To ensure this, Xapian batches up modifications.
This method may be called at any time to commit any pending modifications to the database.
If any of the modifications fail, an exception will be thrown and the database will be left in a state in which each separate addition, replacement or deletion operation has either been fully performed or not performed at all: it is then up to the application to work out which operations need to be repeated.
However, note that if called on a sharded database, atomicity isn't guaranteed between shards - it's possible for the changes to one shard to be committed but changes to another shard to fail.
It's not valid to call commit() within a transaction.
Beware of calling commit() too frequently: this will make indexing take much longer.
Note that commit() need not be called explicitly: it will be called automatically when the database is closed, or when a sufficient number of modifications have been made. By default, this is every 10000 documents added, deleted, or modified. This value is rather conservative, and if you have a machine with plenty of memory, you can improve indexing throughput dramatically by setting XAPIAN_FLUSH_THRESHOLD in the environment to a larger value.
This method was new in Xapian 1.1.0 - in earlier versions it was called flush().
Xapian::DatabaseError | will be thrown if a problem occurs while modifying the database. |
Xapian::DatabaseCorruptError | will be thrown if the database is in a corrupt state. |
void Xapian::WritableDatabase::commit_transaction | ( | ) |
Complete the transaction currently in progress.
If this method completes successfully and this is a flushed transaction, all the database modifications made during the transaction will have been committed to the database.
If an error occurs, an exception will be thrown, and none of the modifications made to the database during the transaction will have been applied to the database.
In all cases the transaction will no longer be in progress.
Note that if called on a sharded database, atomicity isn't guaranteed between shards. Within each shard, the transaction will still act atomically.
Xapian::DatabaseError | will be thrown if a problem occurs while modifying the database. |
Xapian::DatabaseCorruptError | will be thrown if the database is in a corrupt state. |
Xapian::InvalidOperationError | will be thrown if a transaction is not currently in progress. |
Xapian::UnimplementedError | will be thrown if transactions are not available for this database type. |
void Xapian::WritableDatabase::delete_document | ( | const std::string & | unique_term | ) |
Delete any documents indexed by a term from the database.
This method removes any documents indexed by the specified term from the database.
A major use is for convenience when UIDs from another system are mapped to terms in Xapian, although this method has other uses (for example, you could add a "deletion date" term to documents at index time and use this method to delete all documents due for deletion on a particular date).
unique_term | The term to remove references to. |
Xapian::DatabaseError | will be thrown if a problem occurs while writing to the database. |
Xapian::DatabaseCorruptError | will be thrown if the database is in a corrupt state. |
void Xapian::WritableDatabase::delete_document | ( | Xapian::docid | did | ) |
Delete a document from the database.
This method removes the document with the specified document ID from the database.
Note that changes to the database won't be immediately committed to disk; see commit() for more details.
As with all database modification operations, the effect is atomic: the document will either be fully removed, or the document fails to be removed and an exception is thrown (possibly at a later time when commit() is called or the database is closed).
did | The document ID of the document to be removed. |
Xapian::DatabaseError | will be thrown if a problem occurs while writing to the database. |
Xapian::DatabaseCorruptError | will be thrown if the database is in a corrupt state. |
virtual std::string Xapian::WritableDatabase::get_description | ( | ) | const |
Return a string describing this object.
Reimplemented from Xapian::Database.
void Xapian::WritableDatabase::operator= | ( | const WritableDatabase & | other | ) |
Assignment is allowed.
The internals are reference counted, so assignment is cheap.
Note that only a WritableDatabase may be assigned to a WritableDatabase: an attempt to assign a Database is caught at compile-time.
other | The object to copy. |
void Xapian::WritableDatabase::remove_spelling | ( | const std::string & | word, |
Xapian::termcount | freqdec = 1 |
) | const |
Remove a word from the spelling dictionary.
The word's frequency is decreased, and if it would become zero or less then the word is removed completely.
word | The word to remove. |
freqdec | How much to decrease its frequency by (default 1). |
void Xapian::WritableDatabase::remove_synonym | ( | const std::string & | term, |
const std::string & | synonym |
) | const |
Remove a synonym for a term.
term | The term to remove a synonym for. |
synonym | The synonym to remove. If this isn't currently a synonym for term, then no action is taken. |
Xapian::docid Xapian::WritableDatabase::replace_document | ( | const std::string & | unique_term, |
const Xapian::Document & | document |
) |
Replace any documents matching a term.
This method replaces any documents indexed by the specified term with the specified document. If any documents are indexed by the term, the lowest document ID will be used for the document, otherwise a new document ID will be generated as for add_document.
One common use is to allow UIDs from another system to easily be mapped to terms in Xapian. Note that this method doesn't automatically add unique_term as a term, so you'll need to call document.add_term(unique_term) first when using replace_document() in this way.
Note that changes to the database won't be immediately committed to disk; see commit() for more details.
As with all database modification operations, the effect is atomic: the document(s) will either be fully replaced, or the document(s) fail to be replaced and an exception is thrown (possibly at a later time when commit() is called or the database is closed).
unique_term | The "unique" term. |
document | The new document. |
Xapian::DatabaseError | will be thrown if a problem occurs while writing to the database. |
Xapian::DatabaseCorruptError | will be thrown if the database is in a corrupt state. |
void Xapian::WritableDatabase::replace_document | ( | Xapian::docid | did, |
const Xapian::Document & | document |
) |
Replace a given document in the database.
This method replaces the document with the specified document ID. If document ID did isn't currently used, the document will be added with document ID did.
The monotonic counter used for automatically allocating document IDs is increased so that the next automatically allocated document ID will be did + 1. Be aware that if you use this method to specify a high document ID for a new document, and also use WritableDatabase::add_document(), Xapian may get to a state where this counter wraps around and will be unable to automatically allocate document IDs!
Note that changes to the database won't be immediately committed to disk; see commit() for more details.
As with all database modification operations, the effect is atomic: the document will either be fully replaced, or the document fails to be replaced and an exception is thrown (possibly at a later time when commit() is called or the database is closed).
did | The document ID of the document to be replaced. |
document | The new document. |
Xapian::DatabaseError | will be thrown if a problem occurs while writing to the database. |
Xapian::DatabaseCorruptError | will be thrown if the database is in a corrupt state. |
void Xapian::WritableDatabase::set_metadata | ( | const std::string & | key, |
const std::string & | metadata |
) |
Set the user-specified metadata associated with a given key.
This method sets the metadata value associated with a given key. If there is already a metadata value stored in the database with the same key, the old value is replaced. If you want to delete an existing item of metadata, just set its value to the empty string.
User-specified metadata allows you to store arbitrary information in the form of (key, value) pairs.
There's no hard limit on the number of metadata items, or the size of the metadata values. Metadata keys have a limited length, which depends on the backend. We recommend limiting them to 200 bytes. Empty keys are not valid, and specifying one will cause an exception.
Metadata modifications are committed to disk in the same way as modifications to the documents in the database are: i.e., modifications are atomic, and won't be committed to disk immediately (see commit() for more details). This allows metadata to be used to link databases with versioned external resources by storing the appropriate version number in a metadata item.
You can also use the metadata to store arbitrary extra information associated with terms, documents, or postings by encoding the termname and/or document id into the metadata key.
key | The key of the metadata item to set. |
metadata | The value of the metadata item to set. |
Xapian::DatabaseError | will be thrown if a problem occurs while writing to the database. |
Xapian::DatabaseCorruptError | will be thrown if the database is in a corrupt state. |
Xapian::InvalidArgumentError | will be thrown if the key supplied is empty. |
Xapian::UnimplementedError | will be thrown if the database backend in use doesn't support user-specified metadata. |