CVS Revision $Revision: 1.6 $

Tools

Xpat searching

 
Command
    Results

1: pr.500 shift.-400 sample.5
    Show 500 characters of text for each match; move context to point 400 bytes before match; show a sample of 5 matches.

2: region DOC incl.2 region PLAY
    Match DOCs which include at least two PLAY elements

3: 2 not incl.3 region PLAY
    All DOCs with exactly two PLAY elements (initial '2' represents results of previous query)

4: region DOC incl [2844920]
    The DOC which includes a particular byte offset

5: (region HI within region P) not within region HEADER
    Nested query: A P containing a HI, which is not within a HEADER.

How to find the title of a document which contains a particular byte offset:
1: region DOC incl [byteoffset]
2: region TITLE within region DOC
3: pr.region.2
See also: Xpat online documentation.


scp (Cygwin)

scp, a command available in Cygwin, copies files from a server onto a local hard drive and back again from the command line, which is much faster than transferring them by hand.

From the first Cygwin screen, type cd /cygdrive/DRIVENAME (e.g., cd /cygdrive/c) to switch to a local drive.  Then use cd to change to the local directory you wish to copy files to or from.

To get files from server:

scp USERNAME@SERVER:[ABSOLUTE PATH] .

For example:
scp jsmith@server.umich.edu:/dlxs/prep/e/eepf/conversion/0raw/eepf01.xml .

(Copies eepf01.xml from the server to the current local directory.)

To put files on server:
scp FILE USERNAME@SERVER:[ABSOLUTE PATH]

For example:
scp eepf01.xml jsmith@server.umich.edu:/dlxs/prep/e/eepf/conversion/0raw/

(Copies eepf01.xml from the current local directory to the 0raw directory on the server.)
 


cat (UNIX)

Joins contents of files.

Usage:

cat FILES > OUTPUTFILE
For example:
cat *.errs > allerrors
joins the contents of all .errs files into the single file allerrors.  When using the * matching operator, make sure that it doesn't also match the name of the output file: depending on the system, cat will either refuse to read that file or copy its own output back into it, leaving the file larger than it should be.  Also, make sure that you're not matching any files you don't want (e.g., beware that xemacs creates backup files with a tilde suffix).
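
The precautions above can be sketched as a small shell function. This is only an illustration, not part of the original instructions: combine_errs is a hypothetical helper name, and the output name defaults to allerrors as in the example. It refuses an output name that the *.errs pattern would match and lists the matched files before joining them.

```shell
# Sketch only: combine_errs is a hypothetical helper, not a standard command.
# Joins every .errs file in the current directory into one output file.
combine_errs() {
    out=${1:-allerrors}
    # Refuse an output name that the *.errs pattern would also match.
    case $out in
        *.errs)
            echo "output name must not end in .errs: $out" >&2
            return 1
            ;;
    esac
    # Show which files the pattern matches (sent to stderr, not the output file).
    printf 'joining: %s\n' *.errs >&2
    cat *.errs > "$out"
}
```

Calling combine_errs with no argument writes to allerrors. Files such as eepf01.errs~ are not matched, because the pattern requires the name to end in .errs.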


foreach (UNIX)

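Runs a command on each file that matches a pattern (csh/tcsh syntax).

For example:

foreach file (*.xml)
nsgmls -s -f $file.errors doctype $file
end

runs nsgmls on every .xml file in the current directory, where doctype stands for the file supplying the document type declaration.  The -s option suppresses the normal output and -f writes the error messages to a file named after the input (e.g., eepf01.xml.errors).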
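
foreach exists only in csh and tcsh. In a Bourne-style shell (sh, bash) the same loop might be written as below. This is a sketch under assumptions: validate_all is a hypothetical function name, and doctype is the same placeholder as in the loop above.

```shell
# Sketch only: sh/bash equivalent of the csh foreach loop.
# validate_all is a hypothetical name; "doctype" is a placeholder file name.
validate_all() {
    for file in *.xml; do
        # If no .xml files exist, the pattern stays literal; skip it.
        [ -e "$file" ] || continue
        # -s suppresses normal output, -f sends error messages to a file.
        nsgmls -s -f "$file.errors" doctype "$file"
    done
}
```

Quoting "$file" keeps the loop working for file names that contain spaces.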


Find zero-byte files and delete them

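This can be used to delete the empty error files that nsgmls produces for documents with no errors.  The command searches the current directory and all of its subdirectories.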

find . -type f -size 0 -exec rm {} \;
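
Because the deletion cannot be undone, it may be worth listing the matches before removing them. The following is a minimal sketch, assuming a POSIX find; remove_empty_files is a hypothetical helper name, not a standard command.

```shell
# Sketch only: remove_empty_files is a hypothetical helper name.
# Lists the zero-byte regular files under a directory, then deletes them.
remove_empty_files() {
    dir=${1:-.}
    # Preview: print every zero-byte regular file, including subdirectories.
    find "$dir" -type f -size 0 -print
    # Delete the same files, passing them to rm in batches.
    find "$dir" -type f -size 0 -exec rm -- {} +
}
```

For example, remove_empty_files . acts on the current directory and everything below it.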