libtpc 0.1
Textpressocentral core library
Textpresso Central is a platform for performing full-text literature searches, viewing and curating research papers, and training and applying machine learning (ML) and text mining (TM) algorithms for semantic analysis and curation. To support these tasks, users can select, edit, and store lists of papers, sentences, terms, and categories for use in training and mining. The system is designed to let the user perform as many operations as possible on a literature corpus or on a particular paper. It builds on software packages and frameworks such as the Unstructured Information Management Architecture (UIMA), Lucene, and Wt. The corpus of papers can be built from full-text articles available in PDF or NXML format.
libtpc is the core library of Textpresso Central. It includes functions to convert, annotate, index, and search documents.
To compile libtpc, the following libraries and programs are needed:
cmake version >= 3.5 is required.
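A quick way to confirm the cmake requirement before building is sketched below. The helper name `version_ge` is introduced here for illustration, and the comparison assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
# Succeed (exit status 0) when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report whether the installed cmake satisfies the 3.5 minimum.
if command -v cmake >/dev/null 2>&1; then
    # `cmake --version` prints "cmake version X.Y.Z" on its first line.
    installed="$(cmake --version | head -n 1 | awk '{print $3}')"
    if version_ge "$installed" "3.5"; then
        echo "cmake $installed is recent enough"
    else
        echo "cmake $installed is older than 3.5" >&2
    fi
else
    echo "cmake not found" >&2
fi
```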
To compile and install libtpc, run the following commands from the root directory of the repository:
This will install libtpc in its default location (/usr/local/lib).
libtpc can also be compiled and installed in debug mode, with the following commands:
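The exact commands are not reproduced in this text. As a rough sketch, a conventional out-of-source CMake build could look like the following. The function name `build_libtpc` and the build directory names are assumptions, as is the idea that debug mode corresponds to `CMAKE_BUILD_TYPE=Debug`; check the repository's CMakeLists.txt before relying on it.

```shell
# Configure, build, and install libtpc in a separate build directory.
# $1: CMake build type, Release (default) or Debug.
# $2: build directory, assumed to sit directly under the repository root.
build_libtpc() {
    build_type="${1:-Release}"
    build_dir="${2:-build-$build_type}"
    mkdir -p "$build_dir" &&
        ( cd "$build_dir" &&
          cmake -DCMAKE_BUILD_TYPE="$build_type" .. &&
          make &&
          sudo make install )   # installing under /usr/local usually needs root
}
```

Run from the root directory of the repository, `build_libtpc Release` would produce a release build and `build_libtpc Debug` a debug build.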
The users "www-data" and "textpresso" must be created, along with "www-data" and "textpresso" databases. To do so, install postgresql-client and set the "trust" authentication method for local access for all users in /etc/postgresql/9.6/main/pg_hba.conf. Then run the following commands to create the users:
Enter the psql console by running:
Inside the console, run the following commands to create the databases:
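One possible way to carry out both steps is to generate the SQL and feed it to psql. The sketch below is not the original command sequence: the helper name `tpc_roles_sql` is hypothetical, and database ownership is left at its default because the text does not specify it.

```shell
# Print SQL that creates a login role and a database of the same name
# for each argument. The double quotes matter: "www-data" contains a hyphen,
# so it is only valid as a quoted identifier.
tpc_roles_sql() {
    for name in "$@"; do
        printf 'CREATE ROLE "%s" LOGIN;\n' "$name"
        printf 'CREATE DATABASE "%s";\n' "$name"
    done
}

tpc_roles_sql www-data textpresso
```

With "trust" authentication in place, the output could be piped into the console, for example `tpc_roles_sql www-data textpresso | psql -U postgres`.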
NOTE If you want to debug libtpc, create a user and a database named after your username (the user that will launch the server instance) and grant that user all privileges on all tables, by running the following command in a psql console launched as user postgres:
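A sketch of the statements involved follows. The helper name `debug_user_sql` is made up for this example, and the `public` schema is an assumption. `GRANT ... ON ALL TABLES IN SCHEMA` only affects tables in the database the console is connected to, so connect to the database whose tables you need to reach.

```shell
# Print SQL for a debugging account named after the given user
# (defaults to the user running this script).
debug_user_sql() {
    user="${1:-$(id -un)}"
    printf 'CREATE ROLE "%s" LOGIN;\n' "$user"
    printf 'CREATE DATABASE "%s";\n' "$user"
    # Assumption: the tables live in the "public" schema of the connected database.
    printf 'GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO "%s";\n' "$user"
}
```

Calling `debug_user_sql` with no argument prints the statements for the current user. They can then be pasted into the psql console.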
The easiest way to populate the database is to restore it from a dump file, with the following command:
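The dump file name and its format are not stated here, so the sketch below handles both common cases. It relies on the fact that custom-format archives produced by `pg_dump -Fc` start with the bytes `PGDMP`, while plain dumps are ordinary SQL text. The function name `restore_dump` and both arguments are placeholders.

```shell
# Restore a PostgreSQL dump into an existing database.
# $1: path to the dump file; $2: name of the target database.
restore_dump() {
    dump="$1"
    db="$2"
    if [ "$(head -c 5 "$dump")" = "PGDMP" ]; then
        # Custom-format archive: needs pg_restore.
        pg_restore -d "$db" "$dump"
    else
        # Plain SQL script: feed it to psql.
        psql -d "$db" -f "$dump"
    fi
}
```

An invocation would take the form `restore_dump <dump file> <database>`, with the target database already created.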
A Docker image based on Ubuntu 16.04, with all the libraries required to compile and run libtpc and the other Textpresso projects (textpressocentral and tpctools), is available on Docker Hub. To pull it, run the following command:
To run the image and connect to an interactive shell:
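The image name on Docker Hub is not given in this text, so the sketch below takes it as an argument. The function name `tpc_docker_shell` is hypothetical. `docker pull` and `docker run -it` are the standard Docker CLI calls for these two steps.

```shell
# Pull an image and start a container from it with an interactive shell.
# $1: image reference on Docker Hub (supply the real repository name).
tpc_docker_shell() {
    image="$1"
    docker pull "$image" &&
        docker run -it "$image" /bin/bash
}
```

The container is kept after the shell exits (no `--rm`), so anything installed inside it remains available when the container is restarted.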
libtpc and the other Textpresso projects can be directly compiled and installed on the image.
The repository contains a Dockerfile, Dockerfile-tpc, that can be used to build a Docker image.
To build the image, run the following command:
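A minimal sketch of such a build, assuming it is run from the repository root so that Dockerfile-tpc and the build context are both in the current directory. The function name and the default tag `libtpc-local` are arbitrary choices for this example.

```shell
# Build an image from Dockerfile-tpc, using the current directory as context.
# $1: tag for the resulting image (any name; defaults to a made-up one).
tpc_docker_build() {
    tag="${1:-libtpc-local}"
    docker build -f Dockerfile-tpc -t "$tag" .
}
```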
NOTE The image comes with libtpc pre-installed, but the other Textpresso projects have to be installed manually from its console.
To connect to a running Docker container, execute the following command:
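As an illustration, `docker exec -it` opens a shell in a container that is already running. The wrapper below is a sketch with a made-up name. When called without an argument, it lists the running containers so that a name or ID can be picked.

```shell
# Open an interactive shell in a running container.
# $1: container name or ID, as shown by `docker ps`.
tpc_docker_attach() {
    if [ -z "${1:-}" ]; then
        # No container given: show the candidates and signal failure.
        docker ps
        return 1
    fi
    docker exec -it "$1" /bin/bash
}
```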
After installing Textpressocentral, the literature data must be populated. To do so, copy the LuceneIndex folder from an existing Textpressocentral installation. For example, to copy the C. elegans literature, run the following commands:
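The source and destination paths are not given in this text, so the sketch below takes both as arguments rather than guessing them. It copies the directory tree with `cp -a`, which preserves permissions and timestamps. The function name `copy_lucene_index` is introduced for this example only.

```shell
# Copy a Lucene index directory tree into a destination directory.
# $1: source directory (for example, a literature folder under LuceneIndex
#     in an existing installation); $2: destination directory.
copy_lucene_index() {
    src="$1"
    dest="$2"
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    # "$src/." copies the contents of src, including hidden files, into dest.
    mkdir -p "$dest" && cp -a "$src/." "$dest/"
}
```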