Data Definition Dictionary

Here are the list of structures found in the Personal Proxy Statistics database. The naming convention should be accurate and consistant. All structure names should be in upper case, with prefixes and words being seperated by underscores. Primary and Foreign keys have a suffix of ID.

Possible values for the Flags column are:

Please see http://www.postgresql.org/docs/user/datatype.htm for a list of PostgreSQL supported data types. I strongly suggest reading the following section from the PostgreSQL FAQ: 4.7) How much database disk space is required to store data from a typical flat file?. Indexes do not contain as much overhead, but do contain the data that is being indexed, so they can be large also.

Table List
TBL_WU_RC564 Work Unit (blocks) table for Project RC5-64
TBL_WU_DES Work Unit (blocks) table for Project DES
TBL_WU_CSC Work Unit (blocks) table for Project CSC
TBL_WU_OGR Work Unit (nodes) table for Project OGR
TBL_OS Operating Systems type lookup table.
TBL_CPU CPU type lookup table.
TBL_HOST IP to DNS caching table.
TBL_PARTICIPANT Email address and participant information table.
TBL_COUNTRY Country list lookup table.

TBL_WU_RC564
Field Flags Data Type Size Description
SUBMITTED R, I datetime 8 bytes This is the date/time that the work units were submitted. There might be a speed problem using the timestamp data type because it doesn't seem to index very well. This needs to be confirmed or denied as the struture of this table depends on it. The timestamp is this only applicable datatype that supports the SET TIME ZONE 'zone' function. (The type 'timestamp' can not be indexed, using datetime as a workaround.)
HOST_ID F, R, I int2 2 bytes This foreign key relates to table TBL_HOST. It represents the host machine that processed this work unit.
PARTICIPANT_ID F, R, I int2 2 bytes This foreign key relates to table TBL_PARTICIPANT. It represents the email address that processed this work unit.
WU_SIZE R int4 4 bytes The number of work units (blocks) processed. int4 data type is used so that agregate functions such as SUM() return results that don't overflow. Maximum 2,147,483,647 blocks.
OS_ID F, R, I int2 2 bytes This foreign key relates to table TBL_OS. It represents the operating system that processed this work unit.
CPU_ID F, R, I int2 2 bytes This foreign key relates to table TBL_CPU. It represents the CPU that processed this work unit.
VERSION R, I varchar(8) (4+) 8 bytes The client version that processed this work unit.
Totals     (36+) (4+) 32 bytes
72 bytes total
Notice that this value is approximately equal to, or less than, the corresponding size of one row in the proxy log file. When compression algorithms are used to combine SUBMITTED, HOST_ID, PARTICIPANT_ID, CPU_ID, OS_ID, and VERSION, the final size of this table should be significantly less than the combined size of the log files.

TBL_WU_DES has the same layout as TBL_WU_RC564.

TBL_WU_CSC has the same layout as TBL_WU_RC564.

TBL_WU_OGR
Field Flags Data Type Size Description
SUBMITTED R, I datetime 8 bytes This is the date/time that the work units were submitted. (The type 'timestamp' can not be indexed, using datetime as a workaround.)
HOST_ID F, R, I int2 2 bytes This foreign key relates to table TBL_HOST. It represents the host machine that processed this work unit.
PARTICIPANT_ID F, R, I int2 2 bytes This foreign key relates to table TBL_PARTICIPANT. It represents the email address that processed this work unit.
WU_SIZE R int8 8 bytes The number of work units (nodes) processed. int8 data type is used so that agregate functions such as SUM() return results that don't overflow. Maximum +18 decimal points.
OS_ID F, R, I int2 2 bytes This foreign key relates to table TBL_OS. It represents the operating system that processed this work unit.
CPU_ID F, R, I int2 2 bytes This foreign key relates to table TBL_CPU. It represents the CPU that processed this work unit.
VERSION R, I varchar(8) (4+) 8 bytes The client version that processed this work unit.
Totals     (36+) (4+) 36 bytes
76 bytes total
 

TBL_OS
Field Flags Data Type Size Description
OS_ID P int2 2 bytes This is the ID number assigned to this operating system by Distributed.net.
OS R varchar(30) (4+) 30 bytes The descriptive name of this operating system.
OS_ICON R varchar(30) (4+) 30 bytes The icon file name for this operating system.
Totals     (36+) (4+) 70 bytes
110 bytes total
 

TBL_CPU
Field Flags Data Type Size Description
CPU_ID P int2 2 bytes This is the ID number assigned to this CPU type by Distributed.net.
CPU R varchar(30) (4+) 30 bytes The descriptive name of this CPU.
CPU_ICON R varchar(30) (4+) 30 bytes The icon file name for this CPU.
Totals     (36+) (4+) 70 bytes
110 bytes total
 

TBL_HOST
Field Flags Data Type Size Description
HOST_ID P, S int2 2 bytes The interal ID number automatically assigned to this host. (This field is unique and required, however it does not use a sequence generator. New hosts are given the value of MAX(HOST_ID) + 1.)
IP U, R, I varchar(16) (4+) 16 bytes The IP address of this host.
HOSTNAME   varchar(50) (4+) 50 bytes The cached DNS hostname for this IP address. If no hostname can be resolved, then the IP address is inserted into this field.
UPDATED   timestamp 8 bytes The date/time of the last attempted resolution of IP address to hostname. After a user defined length of time has passed, the IP address is resolved again and stored into the HOSTNAME field. Caching hostnames in this manner reduces very costly system calls to gethostbyname().
Totals     (36+) (4+) 84 bytes
124 bytes total
 

TBL_PARTICIPANT
Field Flags Data Type Size Description
PARTICIPANT_ID P, S int2 2 bytes The internal ID number automatically assigned to this participant. (This field is unique and required, however it does not use a sequence generator. New participants are given the value of MAX(PARTICIPANT_ID) + 1.)
EMAIL U, R, I varchar(50) (4+) 50 bytes The email address of this participant.
DNET_ID   varchar(20) (4+) 20 bytes The Distributed.Net ID for this participant. The ID is usually a short 5 or 6 digit number.
FNAME   varchar(30) (4+) 30 bytes The first name of this participant.
LNAME   varchar(30) (4+) 30 bytes The last name of this particpant.
NICKNAME   varchar(30) (4+) 30 bytes The nickname of this participant. If this value is entered into this field, then all appearences of the participants email address in the stats pages will be replaced with it.
DOB   date 4 bytes The date of birth of this participant.
HOMEPAGE   varchar(250) (4+) 250 bytes The URL for this participants home page on the web.
COUNTRY_ID R, I varchar(4) (4+) 4 bytes This foreign key relates to table TBL_COUNTRY. Statistical analsys of work done by domain uses this field for group by calculations. Because this field is required, an attempt to guess the participants country will be made by inspecting the top level domain of the participants email address.
MEMBER R boolean 1 byte This field indicates where or not this email address should be shown in the stats. When marked as FALSE, no blocks will be credited to this email address. A seperate script will be used to mark participants as "blacklisted". This field will be set to FALSE for their record, and a one time only deletion of all stats from the WU tables will occur.
RETIRE_TO_ID   int2 2 bytes This field is set when an email address is retired. All work that is currently credited to this participant is updated to the new participant ID, and all future work is automatically set to the new ID as well. This functionality is provided by ppdb.pl
Totals     (36+) (4+) 451 bytes
491 bytes total
 

TBL_COUNTRY
Field Flags Data Type Size Description
COUNTRY_ID P varchar(4) (4+) 4 bytes Top level domain name of this country. (com, org, uk, etc)
COUNTRY R varchar(50) (4+) 50 bytes The full country name for this domain.
COUNTRY_ICON R varchar(30) (4+) 30 bytes The icon file name for this country.
Totals     (36+) (4+) 96 bytes
136 bytes total
 


$Id: ddd.php3,v 1.5 2000/05/31 21:26:38 kpesce Exp $