We were recently retained by a customer to extract data from an application running Sybase Adaptive Server Anywhere 6 on Windows NT, and this article details the steps we took to get the data out in a reasonable time, in a way the customer could reproduce without our help. The target audience is the consultant who is generally skilled in databases and problem solving but doesn't know Sybase all that well. We were in this very same boat until just before this project.
We expect that the approaches taken could apply to other Sybase releases and perhaps to other databases in general.
Our customer in turn had a prospective customer who ran a competitive software system built on Sybase. Our customer wanted to extract the data from that system into text files for import into his own system as part of a possible sale. He knew nothing about Sybase, so he found us.
The prospective customer installed a modem on the NT 4.0 Server and enabled pcAnywhere dial-in access: we were able to dial in for a couple of hours each evening during the window between end-of-business and when the backup kicked in. This system had no internet access of any kind.
Nobody involved knew anything about the Sybase specifics, including any of the database passwords. This was apparently a turnkey application, and we presume that the competitive vendor would not likely be forthcoming with this information. The circumstances were also such that we didn't care to press the prospective customer too much, so we were largely on our own.
For purposes of this discussion, "GoodApp" is our customer's application, and "SadApp" is the competitive one run by the prospective customer.
Sybase documentation for the release we used can be found at http://sybooks.sybase.com/awg0603e.html.
Information about achieving sa access in other versions of Sybase can be found on this web server here.
Sybase Adaptive Server Anywhere 6 was installed in the directory C:\ASA6, and the directory holding the executables was in the system path. There were quite a few command-line utilities here, including the promising DBUNLOAD.EXE, but we were never ultimately able to use it, due to what we can only believe were insufficient database permissions.
One important note: Microsoft SQL Server is based on Sybase, and though the two have diverged as each product took on a life of its own, much remains common between them. We were fortunate to be able to leverage our MS-SQL experience when solving this Sybase problem.
Our primary vehicle for working with the database was DBISQL.EXE, an interactive SQL interface. This Windows GUI had sub-windows for commands, data, and error output, and it was sufficient to do everything we needed.
When launching DBISQL, you'll be asked for connection information via a dialog box that offers three ways to connect.
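As an alternative to the dialog, DBISQL can also be handed a connection string on the command line with the -c option. A sketch, using the SadUser credentials that appear elsewhere in this note; the exact set of parameters accepted varies by release, so verify against the docs:

C> DBISQL -c "uid=SadUser;pwd=sadpass"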
Once attached to the database - by whatever method - you can then issue the usual SQL queries that one does when digging around. These are a few of the tidbits we found most helpful. Note that this is not a reference work, and we skip over many details that can be gleaned from the documentation. We just wish to give a starting point.
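For instance, ASA keeps its catalog in system views owned by SYS, and browsing them directly is a quick way to survey what's out there. SYS.SYSTABLE is a standard ASA catalog view, though we note as an assumption that its exact columns vary by release:

SELECT table_name, creator FROM SYS.SYSTABLE;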
For the record, our unsuccessful DBUNLOAD attempts were driven from the command prompt along these lines:

C> DBUNLOAD -c "uid=SadUser;pwd=sadpass" ...params
Our customer required not only the data, but the layout of the tables themselves, so that he could see what all the fields meant. This suggested that even if our DBUNLOAD approach had worked, we'd still have needed to extract the schemas separately.
There were far too many tables (hundreds) to extract by hand, so we opted for an automated approach. We used the sp_tables stored procedure to create a text file containing the list of all tables in the system, then used a perl program to create a pair of SQL scripts from that data: the first extracts all table schemas, and the second actually unloads the data. This approach proved quite effective.
The process started by extracting the full list of tables, and this was done via the DBISQL command:
sp_tables ># C:\UNLOAD\TABLES.TXT
We created the makescripts.p perl program to take this file and create the output scripts, and these scripts can be directly loaded into DBISQL to perform the actual extraction operations. makescripts.p does no database operations of any kind - this is strictly a text-processing application.
We'll study the generated script output first.
The queryschema.sql file produced by makescripts.p contains an sp_columns command for each "interesting" table - defined later - and the output of each is routed to a per-table output file. Each table is also referenced by its "creator" (here, app) to disambiguate tables from various owners.
sp_columns TABLE1,app ># C:/UNLOAD/schemas/app_TABLE1.sql;
sp_columns TABLE2,app ># C:/UNLOAD/schemas/app_TABLE2.sql;
sp_columns TABLE3,app ># C:/UNLOAD/schemas/app_TABLE3.sql;
...
The querydata.sql script - also produced by makescripts.p - likewise contains the UNLOAD commands required to extract data from each table to a data file:
UNLOAD TABLE app.TABLE1 TO 'C:/UNLOAD/data/app_TABLE1.txt';
UNLOAD TABLE app.TABLE2 TO 'C:/UNLOAD/data/app_TABLE2.txt';
UNLOAD TABLE app.TABLE3 TO 'C:/UNLOAD/data/app_TABLE3.txt';
...
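Later SQL Anywhere releases document options on UNLOAD TABLE such as DELIMITED BY and QUOTES to control the output format. We didn't need them, and whether ASA 6 accepts them is an assumption to check against its own docs; the shape would be:

UNLOAD TABLE app.TABLE1 TO 'C:/UNLOAD/data/app_TABLE1.txt' DELIMITED BY ',' QUOTES ON;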
Each of these scripts can be loaded directly into DBISQL via File:Open, and once the script appears in the command window, clicking Execute runs the script and routes the output to the destinations found in the scripts. Extraction of the schemas moves very quickly; the data, of course, takes a bit longer.
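As an alternative to File:Open, DBISQL has a READ statement that executes a script from a file; something along these lines should be equivalent, though we offer it as an untested assumption for this release (the filename quoting rules may vary):

READ C:/UNLOAD/queryschema.sql;
READ C:/UNLOAD/querydata.sql;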
We found that db user SadUser did not have permission to select from a few of the tables, and during the process we received a popup dialog box to this effect; clicking "Continue" allowed the process to carry on (albeit without those few tables).
The makescripts.p program does the bulk of the preparatory work, and it takes a large number of command-line parameters that guide its operation.
makescripts.p [options] --dir=DIR tablefile
For most applications, the command-line parameter list will be long enough that it warrants a small RUNSQL.BAT batch file to contain it (this should all be on one long line):
perl -w makescripts.p --verbose --dir=C:/UNLOAD --skip=VIEW --skip=SYS --skip=dbo --skip=archive --skip="SYSTEM TABLE" tables.txt
Then we simply invoke RUNSQL.BAT to perform all our processing, leaving the two output script files in the current directory. These output scripts are then transferred to the target system for processing.
In our sample case, archive was the creator of a set of tables related to the application that didn't appear interesting to us (all its tables were empty).
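To make the text-processing nature of this step concrete, here is a minimal sketch of the kind of program makescripts.p is. It is not the real program: it assumes the table file holds comma-separated, quoted sp_tables fields (qualifier, owner, name, type), and it hard-codes what the real program takes as --dir and --skip options.

#!/usr/bin/perl -w
# A sketch of makescripts.p, not the real thing: read a table list and
# emit the two DBISQL scripts. The input format is an assumption.
use strict;

my $dir  = 'C:/UNLOAD';                       # the --dir parameter
my %skip = map { $_ => 1 }                    # the --skip parameters
           ('VIEW', 'SYS', 'dbo', 'archive', 'SYSTEM TABLE');

open my $schema, '>', 'queryschema.sql' or die "queryschema.sql: $!";
open my $data,   '>', 'querydata.sql'   or die "querydata.sql: $!";

while (<>) {
    my @f = /'([^']*)'/g;                     # grab all quoted fields
    next unless @f >= 4;                      # skip headers/blank lines
    my ($owner, $table, $type) = @f[1, 2, 3]; # qualifier,owner,name,type
    next if $skip{$owner} or $skip{$type};    # honor the skip list

    print $schema "sp_columns $table,$owner ># $dir/schemas/${owner}_$table.sql;\n";
    print $data   "UNLOAD TABLE $owner.$table TO '$dir/data/${owner}_$table.txt';\n";
}

It would be run via perl -w with the table file as its argument, just as the RUNSQL.BAT example shows.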
Once the queryschema.sql script is run, the resulting schema files are in a very poor format for a database developer trying to decide how to import the data. These exported files are all found in the DIR/schemas/ directory, each with the (poorly chosen) name creator_tablename.sql, even though they're not SQL files at all.
We have decoded most of the important fields and have written a second perl program that runs through each file and creates a human-readable version of the schema. fixschemas.p is given a list of *.SQL filenames and creates a set of *.SCH output text files.
C> perl -w fixschemas.p schemas\*.sql
The resulting schema files are suitable for printing or study.
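For the curious, the overall shape of fixschemas.p is roughly as follows. This is a sketch rather than the real decoder: the field offsets assume sp_columns output keeps its documented column order (column name fourth, type name sixth) and that a naive comma split holds up, both of which should be verified against your own files.

#!/usr/bin/perl -w
# A sketch of fixschemas.p: turn each *.sql schema dump into a *.SCH
# text file. Field offsets are assumptions about sp_columns output.
use strict;

for my $file (map { glob($_) } @ARGV) {       # cmd.exe doesn't expand *
    (my $out = $file) =~ s/\.sql$/.SCH/i;
    open my $in,  '<', $file or die "$file: $!";
    open my $sch, '>', $out  or die "$out: $!";
    while (<$in>) {
        chomp;
        my @f = split /,/;                    # naive CSV split
        s/^\s*'|'\s*$//g for @f;              # strip quotes/whitespace
        next unless @f >= 6;
        printf $sch "%-30s %s\n", $f[3], $f[5];  # column name, type name
    }
}

Note that the program globs its own arguments, since the Windows command prompt does not expand wildcards the way a Unix shell would.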
The schema decoding leaves much to be desired, and (in particular) takes no account of indexes, keys, or referential relationships. All of these could conceivably be extracted from the database, but we have not yet found this necessary.
We'll note in particular that we're fuzzy on exactly how the "width" and "precision" fields are decoded, and we very well could be understanding them wrong.
These two perl programs can be found on this web server. Both are stored here as text files to allow for easier downloading.
This code is in the public domain.