libtpc  0.1
Textpressocentral core library
tpc::cas::CASManager Class Reference

Static Public Member Functions

static void convert_raw_file_to_cas1 (const std::string &file_path, FileType type, const std::string &out_dir, bool use_parent_dir_as_outname=false)
 
static int convert_cas1_to_cas2 (const std::string &file_path, const std::string &out_dir)
 
static BibInfo get_bib_info_from_xml_text (const std::string &xml_text)
 
static std::vector< std::string > classify_article_into_corpora_from_bib_file (const BibInfo &bib_info)
 

Member Function Documentation

std::vector< std::string > CASManager::classify_article_into_corpora_from_bib_file ( const BibInfo &  bib_info)
static

Get the list of corpora to which the article belongs, as determined by classifying its bibliographic information.

Parameters
bib_info    object containing the bibliographic information of the article
int CASManager::convert_cas1_to_cas2 ( const std::string &  file_path,
const std::string &  out_dir 
)
static

Convert a cas1 file to cas2 format and save it to the specified location.

Parameters
file_path    the path to the cas1 file
out_dir      the directory in which to save the new cas file
Returns
1 if the conversion succeeded, 0 otherwise
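
The return convention described above (1 on success, 0 on failure) is the opposite of the usual process exit-code convention, so call sites are easy to get wrong. The sketch below wraps the call so that a failure raises an exception instead. It is illustrative only: `convert_or_throw` is a hypothetical helper, not part of libtpc, and `convert_cas1_to_cas2_stub` is a stand-in with the documented signature so that the example compiles without the library.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Callable type matching the documented signature of
// CASManager::convert_cas1_to_cas2.
using Cas1ToCas2Fn =
    std::function<int(const std::string&, const std::string&)>;

// Hypothetical helper (not part of libtpc): maps the documented
// 1 = success / 0 = failure return code to an exception on failure.
inline void convert_or_throw(const Cas1ToCas2Fn& convert,
                             const std::string& file_path,
                             const std::string& out_dir) {
    if (convert(file_path, out_dir) != 1) {
        throw std::runtime_error("cas1 to cas2 conversion failed for " +
                                 file_path);
    }
}

// Stand-in used only so the sketch compiles without libtpc (assumption):
// it reports success whenever the input path is non-empty.
inline int convert_cas1_to_cas2_stub(const std::string& file_path,
                                     const std::string& /*out_dir*/) {
    return file_path.empty() ? 0 : 1;
}
```

With the real library, the first argument would presumably be `&tpc::cas::CASManager::convert_cas1_to_cas2`; because the member is static, it converts to an ordinary function pointer and fits the `std::function` type above.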
void CASManager::convert_raw_file_to_cas1 ( const std::string &  file_path,
FileType  type,
const std::string &  out_dir,
bool  use_parent_dir_as_outname = false 
)
static

Convert a PDF or XML article to cas1 format and save it to the specified location.

Parameters
file_path    the path to the raw file
type         the type of file
out_dir      the directory in which to save the new cas file
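
Since the caller must state whether the raw file is a PDF or an XML article, one plausible approach is to derive that from the file extension before calling the conversion. The sketch below is an assumption-laden illustration: `RawKind` is a hypothetical stand-in for the library's `FileType` (whose enumerator names are not listed in this reference), and `raw_kind_from_path` is a hypothetical helper, not part of libtpc.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical stand-in for the library's FileType; the real enumerator
// names are not shown in this reference.
enum class RawKind { Pdf, Xml };

// Hypothetical helper: guesses the raw file kind from the text after the
// last '.', ignoring case. Returns false (leaving `kind` untouched) when
// the extension is neither "pdf" nor "xml".
inline bool raw_kind_from_path(const std::string& file_path, RawKind& kind) {
    const std::string::size_type dot = file_path.rfind('.');
    if (dot == std::string::npos) {
        return false;
    }
    std::string ext = file_path.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    if (ext == "pdf") {
        kind = RawKind::Pdf;
        return true;
    }
    if (ext == "xml") {
        kind = RawKind::Xml;
        return true;
    }
    return false;
}
```

The result would then be mapped to the corresponding `FileType` value and passed as the `type` argument. Extension sniffing is only a heuristic; a caller that already knows the source of each file can skip it.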
BibInfo CASManager::get_bib_info_from_xml_text ( const std::string &  xml_text)
static

Extract bibliographic information from the XML full text of an article.

Parameters
xml_text    a string containing the XML full text of an article
Returns
the bibliographic information of the XML article
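
The `BibInfo` returned here has the type that `classify_article_into_corpora_from_bib_file` accepts, so the two functions can be chained to go from XML text to a list of corpora. The sketch below shows that composition in a library-independent way. `corpora_for_xml`, `FakeBibInfo`, `fake_extract` and `fake_classify` are hypothetical names introduced for illustration, and the stand-in classification rule is invented, not the library's.

```cpp
#include <string>
#include <vector>

// Two-step pipeline: extract bibliographic info from XML text, then
// classify it into corpora. Both steps are passed in as callables so the
// sketch compiles without libtpc.
template <typename ExtractFn, typename ClassifyFn>
std::vector<std::string> corpora_for_xml(const std::string& xml_text,
                                         ExtractFn extract,
                                         ClassifyFn classify) {
    const auto bib_info = extract(xml_text);
    return classify(bib_info);
}

// Hypothetical stand-ins (assumptions, not the libtpc types or logic).
struct FakeBibInfo {
    std::string title;
};

inline FakeBibInfo fake_extract(const std::string& xml_text) {
    return FakeBibInfo{xml_text};
}

inline std::vector<std::string> fake_classify(const FakeBibInfo& info) {
    // Invented rule: any non-empty record lands in one example corpus.
    return info.title.empty() ? std::vector<std::string>{}
                              : std::vector<std::string>{"example corpus"};
}
```

With the real library, the two callables would presumably be `&CASManager::get_bib_info_from_xml_text` and `&CASManager::classify_article_into_corpora_from_bib_file`; both are static members, so they can be passed as plain function pointers.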

The documentation for this class was generated from the following files: