
SciELO Methodology

SciELO PC Programs (Windows/Visual Basic/VBA Word)
Server programs: Title Manager, Code Manager, Converter/Parser, XML SciELO
Workstation program: Markup/Parser
Located on: c:\scielo\bin\
SciELO Processing (.bat, .sh, java)
GeraPadrao
Programs to export data
Bibliometrics
Etc
Located on: c:\home\scielo\www\proc or c:\scielo\web\proc
SciELO Web (Apache, PHP, WWWISIS)
Located on: c:\home\scielo\www or c:\scielo\web

Computers
Local server (Windows):
Title Manager
Code Manager
Converter
XML SciELO
Markup/Parser
Local web site
File storage (img, pdf, html, etc.)
1 or more workstations (Windows):
Markup/Parser
Microsoft Word
Linux server:
processing
homologation web site
production web site
Obs.: Each of these functions can run on one or more Linux servers.
The programs are in: c:\scielo\bin\
The data are in: c:\scielo\serial
The data of each journal are in:
c:\scielo\serial\<journal_acronym>
The data of each journal issue are in:
c:\scielo\serial\<journal_acronym>\v*n*
Before using the programs, it is necessary to check if all files are in the
correct structure.
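The v*n* issue folder pattern above can be checked before running the programs. The following is a minimal sketch, assuming folder names of the form v<VOL>n<NUM> such as v105n7-8; the function name and the regular expression are illustrative and not part of the SciELO programs, and any other naming variants are not covered.

```python
import re

# Assumed pattern: "v", a numeric volume, "n", then the issue number,
# which may contain letters, digits, underscores or hyphens (e.g. "7-8").
ISSUE_FOLDER = re.compile(r"^v(?P<vol>\d+)n(?P<num>[\w-]+)$")


def parse_issue_folder(name):
    """Return (volume, number) for a v<VOL>n<NUM> folder name, else None."""
    match = ISSUE_FOLDER.match(name)
    if match is None:
        return None
    return match.group("vol"), match.group("num")
```

For example, `parse_issue_folder("v105n7-8")` yields the volume "105" and the number "7-8", while a name such as "body" is rejected.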

Under the volume and number folder, the following directories must be
created:
Files structure in the local server
(Windows)
Body
Contains all articles of an issue, each article in its own file, named in the correct way.
Example: c:\SciELO\serial\sajs\v105n7-8\body\a01v29n1.html
Img
Contains all images, figures, graphics, etc., named in the correct way.
Example: c:\SciELO\serial\sajs\v105n7-8\img\a01fig01.gif
Markup
Contains the articles to be marked. The files from the body folder should be copied and pasted into this folder.
Example: c:\SciELO\serial\sajs\v105n7-8\markup\a01v29n1.html
PDF
Contains the PDF files, which must be named in the same way as the HTML files.
Example: c:\SciELO\serial\sajs\v105n7-8\pdf\a01v29n1.pdf
Source
Contains the original files (final version), without any sort of last-minute modifications or adjustments.
Example: c:\SciELO\serial\sajs\v105n7-8\source\editorial.pm6
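The check that an issue folder has the expected structure can be automated. This is a minimal sketch, assuming the five subfolders named on this slide (body, img, markup, pdf, source); the function name is hypothetical and the script is not part of the SciELO PC programs.

```python
from pathlib import Path

# Subfolders the slide lists under each volume/number folder.
REQUIRED_SUBFOLDERS = ("body", "img", "markup", "pdf", "source")


def missing_subfolders(issue_dir):
    """Return the required subfolders that do not exist under issue_dir."""
    issue_path = Path(issue_dir)
    return [name for name in REQUIRED_SUBFOLDERS
            if not (issue_path / name).is_dir()]
```

Calling `missing_subfolders(r"c:\SciELO\serial\sajs\v105n7-8")` on the local server returns an empty list when all five folders are present.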

[Figure: structure of the journals folders directory]
SciELO PC programs are accessed by the Program Files menu
Components: SciELO PC Programs
SciELO PC Programs: Local server
Title Manager:
program in Visual Basic
manages the title (journals), section (tables of contents of the journals) and issue (issues) databases
located in the local server: c:\scielo\bin\config
Code Manager:
program in Visual Basic
manages the tables of codes (language, country, etc.)
located in the local server: c:\scielo\bin\codes
Converter:
program in Visual Basic
converts the marked-up documents into ISIS databases
located in the local server: c:\scielo\bin\converter
XML SciELO:
batch program
generates the XML for export to ISI and PubMed; it can be modified to generate XML for other databases
located in the local server: c:\scielo\xml_scielo
SciELO PC Programs: workstation
Markup:
program in VBA Word
guides the identification of the elements of the article
located in the local server and workstation: c:\scielo\bin\markup
SGML Parser:
program in VB and C
parses the markup of the documents
located in the local server and workstation: c:\scielo\bin\sgmlpars

Local SciELO web site
It is a preview of the SciELO web site, usually on Windows, but it is also possible to use a Linux server.
The usual file structure is rooted at:
C:\scielo\web (earlier versions)
or C:\home\scielo\www (current versions)
Both Linux and Windows use the same structure below www or web: htdocs, bases, proc, etc.
The difference is the format of the ISIS databases (Windows vs. Linux).
This requires different versions of cisis and wxis.
It also requires the correct versions of Apache and PHP for Windows or Linux.
The command c:\scielo\web\proc\GeraPadrao.bat uses the databases from c:\scielo\serial\* and generates c:\scielo\web\bases\* (or c:\home\scielo\www\bases\*).

File structure of the web site
Web site files
www/bases
  pdf
  translation
  artigo
  issue
  title
  etc.
www/htdocs
  img
  revistas
www/cgi-bin
  Wxis
  IsisScript

Processing files
www/proc
www/bases-work

Workflow in the local server
[Diagram: files reception -> files preparation (doc/html to .html) -> Title Manager (title, section, issue) and Code Manager (code, newcode) -> Markup and Parser in MS-Word (Issue.mds, marked files) -> Converter (issue of a journal, v<VOL>n<NUM>, located on serial) -> local GeraPadrao (scilist; artigo, etc.) -> local web site. Corrections loop back to the earlier steps.]
People receive the files of the issues (.html, images, pdf, etc.).
People prepare and archive them in serial/<journal_acronym>/v<VOL>n<NUM>/ in the folders markup, body, img and pdf, with one .html file per article. At this point, the markup and body folders have the same content.
In parallel, the issue data are registered using Title Manager/Create new issue.
When files arrive for an unregistered journal title, the title must first be registered using Title Manager/Create new title.
After the issue data are registered, Title Manager generates input files for the Markup and Converter programs in their own folders on the computer where Title Manager is running (bin\markup and bin\convert). The files in bin\markup must therefore be copied to the other computers where Markup runs.
The Markup program is used to identify the bibliographic elements of the articles/texts located in the markup folder (serial/<journal_acronym>/v<VOL>n<NUM>/markup).
The Parser program is used to validate the files processed by the Markup program.
The Converter program reads the files located in the markup and body folders of an issue (serial/<journal_acronym>/v<VOL>n<NUM>) and then generates its database.
All the databases generated by the Converter program are used by GeraPadrao to create the database of the web site. The images, PDFs, etc. have to be copied to the corresponding folder to be accessible from the web site.
Code Manager is rarely used. It manages the tables of codes used by SciELO.
Whenever mistakes are found, it is possible to go back, correct the data and redo the process.
Finally, the script EnviaBasesScieloPadrao.bat sends the databases to a server to be processed.
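The preparation step in which the markup folder starts out with the same content as the body folder can be sketched as follows. This is an illustrative script, not part of the SciELO programs: the function name is hypothetical, and skipping files that already exist in markup is a design choice of this sketch so that markup work already done is not overwritten.

```python
import shutil
from pathlib import Path


def seed_markup_folder(issue_dir):
    """Copy each .html file from body/ into markup/ unless it is already there.

    Returns the names of the files that were copied.
    """
    issue_path = Path(issue_dir)
    body = issue_path / "body"
    markup = issue_path / "markup"
    markup.mkdir(exist_ok=True)
    copied = []
    for src in sorted(body.glob("*.html")):
        dst = markup / src.name
        if not dst.exists():  # never overwrite a file that may be marked
            shutil.copy2(src, dst)
            copied.append(src.name)
    return copied
```

Running it a second time on the same issue folder copies nothing, so it is safe to repeat after new articles are added to body.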



Workflow: transferring data from local to processing area
[Diagram labels: local server: issue of a journal (v<VOL>n<NUM>, located on serial); code, newcode; title, newissue (located on bases, resulting from the local GeraPadrao); scilist. EnviaBasesSciELOPadrao.bat, (local) proc/temp/transf2linux, FTP. Processing server.]
Configuration of the processing to send data and files to the processing area
1) Configure the files in C:\scielo\web\proc\transfer\ or C:\home\scielo\www\proc\transfer:
EnviaBasesLogOn.txt
EnviaImgPdfLogOn.txt
EnviaTranslationLogOn.txt
Logon lines, as shown on the slide:
Open <server_name>
<user_name>
<password>
cd <path_www>

2) Execute in C:\scielo\web\proc\ or c:\home\scielo\www\proc\:
EnviaBasesSciELOPadrao.bat: sends the databases from Windows to Linux
EnviaImgPdfSciELOPadrao.bat: sends the img and pdf files from Windows to Linux
EnviaTranslationSciELOPadrao.bat: sends the translations from Windows to Linux

3) Execute GeraPadrao.bat
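The logon lines shown above (Open <server_name>, <user_name>, <password>, cd <path_www>) can be produced from a small template. This is a minimal sketch that assumes a logon file consists of exactly those four lines in that order; the function name is hypothetical, and the real files may contain further lines not shown on the slide.

```python
def render_logon_file(server_name, user_name, password, path_www):
    """Build the four-line logon text in the order shown on the slide."""
    lines = [
        f"Open {server_name}",  # server that receives the files
        user_name,              # login name on that server
        password,               # password, stored in plain text
        f"cd {path_www}",       # remote web site path
    ]
    return "\n".join(lines) + "\n"
```

Because the password is written in plain text, the generated file should be kept out of shared folders and version control.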
Workflow in the Linux server
[Diagram labels: issue of a journal (v<VOL>n<NUM>, located on serial); code, newcode; title, newissue (located on bases, resulting from the local GeraPadrao); scilist -> GeraPadrao.bat -> databases artigo, etc., for the web site -> homologation web site -> copy -> production/public web site.]
After receiving the databases and files, the GeraPadrao.bat script must be executed to generate the databases for the web site. This is necessary because the Windows and Linux database formats are incompatible.





Processing in Windows
Xml_scielo: part of PC Programs/server
PubMed: generates XML for PubMed
ISI: generates XML for Web of Science
EnviaBasesScieloPadrao.bat sends databases to the processing server
EnviaImgPdfScieloPadrao.bat sends img and pdf files to the homologation server
EnviaTranslationScieloPadrao.bat sends translations to the homologation server

Note: the processing, homologation and public servers can be the same machine

Only for SciELO Brasil
Health indicators (Brazilian database)
Curriculum ScienTI / Lattes (Brazilian database, but an adaptation is possible)
Semantic highlights: a trial with Knewco, interrupted.

Centralized processing
SciELO.ORG:
Bibliometrics
Links to Medline, LILACS, etc.
Co-authors (not finished)

Centralized in Brazil
DOAJ: not ready; an agreement with DOAJ is necessary
Accesses

Processing in the instance
scieloUpdate: updates the SciELO web site on Linux
For data exchange:
By sending data
Envia2Medline.bat: feeds scielo.org. Used for bibliometrics, etc.
Crossref
By making data available for harvesting
Google Scholar
Web services
By querying
Scimago: query to http://www.scimagojr.com/journalrank.php
Databases: related and cited, from SciELO.org

In the instance: no processing or data exchange
External services:
Google Analytics
OAI

Installation and configuration of SciELO web site and processings in a Linux server
http://reddes.bvsalud.org/projects/scielo-metodologia/browser/tags/v5.0-pr/docs/SciELO-Web-5.0_installation_guide_en.pdf

Installation and configuration of the local SciELO web site
http://webdevcodex.com/tutorial-installing-apache2-php5-mysql5-phpmyadmin3-windows-7-vista/
Local SciELO web site
Version 3: (php4.3.x)
http://reddes.bvsalud.org/projects/scielo-metodologia/browser/branches/scielo-web_3.3
Version 4: (php4.3.x - php5.2.x)
http://reddes.bvsalud.org/projects/scielo-metodologia/browser/tags/

Using VirtualBox to host a Linux server in Windows
Configuring the network:
Bridge
Move or rename the file /etc/udev/rules.d/70-persistent-net.rules so that the machine gets an IP
Mapping the database folder in Linux to a folder in Windows, to use the free space in Windows.
The space in the VM is limited to 28 GB.
Linux: /home/scielo/www/bases => Windows: c:\scielo\bases\linux\public
Linux: /home/scielo_homolog/www/bases => Windows: c:\scielo\bases\linux\homolog\

VirtualBox:
Settings: shared folders
add share: folder C:\scielo\bases\linux\public, name bases
add share: folder C:\scielo\bases\linux\homolog, name bases_homolog
/etc/fstab:
bases -> /home/scielo/www/bases
bases_homolog -> /home/scielo_homolog/www/bases
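The slide gives only the share-to-mount-point mapping for /etc/fstab. As a hedged sketch, the full entries could be generated as below, assuming the shares are mounted with the vboxsf filesystem type that VirtualBox provides for shared folders (it requires the Guest Additions in the Linux guest). The mount options ("defaults") and the function name are assumptions, not taken from the slides.

```python
# Share name -> mount point, as listed on the slide.
SHARES = {
    "bases": "/home/scielo/www/bases",
    "bases_homolog": "/home/scielo_homolog/www/bases",
}


def fstab_lines(shares, options="defaults"):
    """Return one /etc/fstab line per VirtualBox shared folder."""
    # Field order: device (share name), mount point, type, options, dump, pass.
    return [f"{name} {mount} vboxsf {options} 0 0"
            for name, mount in shares.items()]
```

The resulting lines would be appended to /etc/fstab by hand; ownership options such as uid and gid may be needed so that the web server user can write to the mounted folders.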

Configuration in the server (Linux)
Edit /etc/hosts, including the IP and server name:
127.0.0.1 servername
Example:
127.0.0.1 vm.scielo.br
NOTE: 127.0.0.1 DOES NOT CHANGE
Configuration of the Virtual Host
Edit:
Ubuntu: /etc/apache2/sites-available/scielo
Windows: <Apache_path>/conf/extra/httpd-vhosts.conf
[Figure: virtual host file with the values below highlighted in colour]
Blue: scielolocal@domain.org = the e-mail of the web site administrator
Red: /home/scielo = path of the web site
Green: localscielo = URL of the web site
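The three highlighted values map naturally onto standard Apache directives. The sketch below renders a minimal virtual host from them; it is not the SciELO virtual host file itself, which is not reproduced here. The use of port 80, the choice of <path>/www/htdocs as DocumentRoot (based on the www/htdocs folder described earlier) and the function name are assumptions, and the real file likely contains further directives, for example for cgi-bin.

```python
VHOST_TEMPLATE = """<VirtualHost *:80>
    ServerAdmin {admin_email}
    DocumentRoot {site_path}/www/htdocs
    ServerName {server_name}
</VirtualHost>
"""


def render_vhost(admin_email, site_path, server_name):
    """Fill the assumed minimal template with the three slide values."""
    return VHOST_TEMPLATE.format(
        admin_email=admin_email,   # "blue" value on the slide
        site_path=site_path,       # "red" value on the slide
        server_name=server_name,   # "green" value on the slide
    )
```

For the values on the slide, the call would be `render_vhost("scielolocal@domain.org", "/home/scielo", "localscielo")`.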
New Virtual Host
Create the file for the virtual host in:
/etc/apache2/sites-available/<name_of_the_virtual_host_file>
Example: /etc/apache2/sites-available/scielo_homolog
Copy the scielo virtual host file and edit it to change the configuration as shown in the previous slide:
Path
Server name
After changing the configuration of Apache / virtual hosts, you MUST execute:
sudo /etc/init.d/apache2 reload
Configuration in the computers which access the SciELO web sites: editing the hosts file
Edit C:\windows\system32\drivers\etc\hosts, adding the lines:
127.0.0.1 localscielo
<IP_homolog> homologscielo

Where:
<IP_homolog> = IP of the homologation SciELO web site
homologscielo = server name of the homologation SciELO web site
Do this on every computer that has to access localscielo and homologscielo.
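Since the same hosts entries must be added on several computers, the edit can be scripted. This is a minimal sketch that works on the text of a hosts file and appends only the names that are missing; the function name is hypothetical, and 192.0.2.10 in the usage note is a documentation-only placeholder for <IP_homolog>.

```python
def add_host_entries(hosts_text, entries):
    """Return hosts_text with an 'IP name' line appended per missing name.

    entries is a list of (ip, name) pairs. Names already present in any
    non-comment line are left alone, so the function can be re-run safely.
    """
    lines = hosts_text.splitlines()
    known = set()
    for line in lines:
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        known.update(fields[1:])                # names follow the IP
    for ip, name in entries:
        if name not in known:
            lines.append(f"{ip} {name}")
            known.add(name)
    return "\n".join(lines) + "\n"
```

A typical call reads C:\windows\system32\drivers\etc\hosts, passes its text with `[("127.0.0.1", "localscielo"), ("192.0.2.10", "homologscielo")]`, and writes the result back with administrator rights.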
