Data Organization-Types-Conversions

Objectives:

  1. Organize your data set into a format that can be used by visual data analysis tools,
  2. Understand some of the more basic data types,
  3. Learn how to convert the different data types most often encountered.

Because time is limited, we provide data sets that are already organized for the particular tools you will use in the various sections throughout this document. Before you can use these graphical tools with your own data, however, you must learn how to organize and construct your own data sets. This also requires some knowledge of the different data types: binary, ASCII, HDF, netCDF, PICT, PICS, GIF, MPEG, etc.

Organization of Data for Visual Data Analysis Tools

There is an almost unlimited number of ways to organize your data. To keep things simple, we start with a Fortran program that creates a data set usable by a number of different visualization tools. The program writes an ASCII data file, fan_64.ascii, organized as follows: the data are scaled to integers from 0 (minimum) to 255 (maximum); each value is written as a three-digit integer field preceded by a space; and each line holds only 20 integers for viewing convenience. The 3-D data are written with the first index (i) varying fastest, then j, then k. All of this is accomplished by including the four numbered statements shown in the schematic Fortran program below.

Here we show that you need to add only four lines to your existing code to create a data set ready for visualization.

Fortran Schematic Program: 

                           program name        
              1 -->        integer image(64,64,62)
              2 -->        open(6,file='fan_64.ascii',status='unknown',err=88)

	                           --- scale densities to integers  ---
                                   minimum = 0  and maximum = 255
     
              3 -->        write(6,10)(((image(i,j,k),i=1,64),j=1,64),k=1,62)
              4 -->     10 format(20(1x,i3))
                        88 stop
                           end

The first 14 lines of the fan_64.ascii file are listed below.

     fan_64.ascii:

     12  12  12  12  11  11  12  12  12  11  11  12  12  11  12  11  10  11  12  12
     12  12  11  12  12  12  12  12  12  11  11  11  13  12  12  12  12  12  11  12
     11  12  12  12  12  12  12  12  12  11  11  11  12  12  12  12  12  12  11  11
     11  12  12  13  12  12  12  11  12  11  12  11  12  12  11  12  12  12  12  11
     12  12  12  12  11  12  12  12  12  11  12  12  12  12  12  11  12  12  12  12
     12  12  12  12  11  12  12  12  13  12  12  11  11  11  11  12  12  12  12  12
     11  12  12  12  12  13  12  12  11  12  11  11  11  11  11  11  12  12  12  11
     11  12  12  12  12  12  12  13  12  12  12  12  11  12  12  12  12  12  11  11
     11  11  12  11  11  12  12  11  12  13  13  12  12  12  12  11  11  12  12  11
     12  12  12  12  12  13  12  11  12  12  12  12  11  11  11  11  11  11  12  12
     11  11  11  12  11  12  11  12  12  11  12  12  12  12  11  11  12  12  11  11
     12  12  11  12  12  12  12  11  12  13  12  12  11  11  12  12  12  12  11  11
     11  12  12  12  12  12  11  12  11  13  12  12  12  13  12  12  11  11  11  11
     12  12  12  12  12  11  12  12  13  12  11  12  11  11  11  12  12  12  11  12
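
The scaling step that the schematic program leaves out, and the 20-values-per-line layout shown above, can be sketched in C as follows. This is only an illustration, assuming a linear mapping of the density range onto 0 to 255 with rounding to the nearest integer; the function names scale_to_byte and write_ascii are hypothetical and do not come from the course software.

```c
#include <assert.h>
#include <stdio.h>

/* Map a density in [lo, hi] linearly onto an integer in 0..255.
   Values outside the range are clamped. Requires hi > lo.
   (Linear scaling is an assumption; the original code is not shown.) */
int scale_to_byte(double value, double lo, double hi)
{
    double t;

    assert(hi > lo);
    t = (value - lo) / (hi - lo);      /* 0.0 at lo, 1.0 at hi */
    if (t < 0.0) t = 0.0;
    if (t > 1.0) t = 1.0;
    return (int)(t * 255.0 + 0.5);     /* round to nearest integer */
}

/* Write n scaled values, 20 per line, each as one space followed by a
   3-character field, mirroring the Fortran format 20(1x,i3). */
void write_ascii(FILE *out, const double *density, long n,
                 double lo, double hi)
{
    long i;

    for (i = 0; i < n; i++) {
        fprintf(out, " %3d", scale_to_byte(density[i], lo, hi));
        if (i % 20 == 19 || i == n - 1)
            fputc('\n', out);          /* end of a full or final line */
    }
}
```

To reproduce the ordering of the Fortran write statement, the density array passed to write_ascii would have to be stored with the first index varying fastest.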

Data Types: just a quick review

For a more complete study of data types and structures as they relate to data visualization, we suggest that you read "The Data Handbook" by Brand Fortner.

Binary:   Machine readable form              
               " byte  * " :  8 - bit integer,  0 => 255 "unsigned character"
               " short * " :  16-bit integer
               " long  * " :  32-bit integer
                       * unsigned binary-integers
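
In C, the widths of the built-in integer types depend on the machine, so one way to obtain exactly the three unsigned sizes listed above is to use the fixed-width types from <stdint.h>. The sketch below assumes a C99 compiler; the typedef names and the helper to_byte are hypothetical and are used here only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-width unsigned integers matching the list above. */
typedef uint8_t  byte_t;    /* "byte":  8-bit,  0 .. 255        */
typedef uint16_t short_t;   /* "short": 16-bit, 0 .. 65535      */
typedef uint32_t long_t;    /* "long":  32-bit, 0 .. 4294967295 */

/* Store a value in one byte, refusing anything outside 0..255
   instead of letting it wrap around silently. */
byte_t to_byte(long v)
{
    assert(v >= 0 && v <= UINT8_MAX);
    return (byte_t)v;
}
```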

ASCII:    Programmer readable form
                   (American Standard Code for Information Interchange)

                       " integer " :  0, 1, 2, 3, ........ -> upper limit depends on hardware
                " floating point " :   0.0, 1.0,  123.4,  0.1234E+3

HDF and netCDF:  Self-describing binary file formats for scientific and graphical data sets

                HDF:  Hierarchical Data Format is a multi-object file format for the transfer    
                      of graphical and floating-point data between computers that was
                      created at the National Center for Supercomputing Applications
                      (NCSA)  address:  NCSA Documentation Orders
                                        152 Computing Applications Bldg.
                                        605 East Springfield Ave.
                                        Champaign, IL 61820
                                        or: Anonymous ftp:  ftp.ncsa.uiuc.edu

           netCDF:  Network Common Data Form is another format, similar to HDF but
                    sponsored by the University Corporation for Atmospheric
                    Research (UCAR) address:  netCDF support
                                              Unidata Program Center
                                              P.O. Box 3000
                                              Boulder, Colorado 80307
                                              or: Anonymous ftp:  unidata.ucar.edu

PICT and PICS:  Image formats created on the Macintosh
         PICT:  Image data created on the Macintosh
         PICS:  A sequence of PICT images stored in one file and typically used
                by animation software packages.

Here we have listed only a few of the more popular data types that you will encounter. For a more extensive discussion, consult an introductory computer science textbook or your on-site computer service organization. Data types are typically machine dependent, so sources of this information tend to be fragmented.

Conversion of Data Types: Emphasis on HDF

Data are typically converted into binary format because most 3-D data sets are so large that the ASCII form, although readable by the programmer, exceeds the memory limits of most computer workstations. For the same reason, binary data files are usually created on mainframe computers or on large-memory workstations. We have included a simple C program (atoi.c) and a Fortran 77 program (m_sds_hdf.f), both listed below, that convert an ASCII data set into a binary file and into an HDF file, respectively. We also show below how to run these conversions on a UNIX workstation that has sufficient memory.
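
The size difference between the two forms can be estimated with a little arithmetic. The sketch below assumes the layout used for fan_64.ascii (four characters per value and up to 20 values per line) and a one-byte newline, as on UNIX; the function names are hypothetical.

```c
#include <assert.h>

/* Bytes needed to store n values in the ASCII layout: four characters
   per value plus one newline per line of up to 20 values. */
long ascii_bytes(long n)
{
    long lines;

    assert(n >= 0);
    lines = (n + 19) / 20;      /* round up to whole lines */
    return 4 * n + lines;
}

/* Bytes needed to store n values as unsigned 8-bit integers. */
long binary_bytes(long n)
{
    assert(n >= 0);
    return n;
}
```

For the 64 x 64 x 62 example (253,952 values) this gives 1,028,506 bytes of ASCII against 253,952 bytes of binary, roughly a factor of four.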


Create binary file  "fan_64.bin"  from ASCII file "fan_64.ascii"

          % cc atoi.c -o atoi.x
          % cat fan_64.ascii | atoi.x > fan_64.bin
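
Because atoi.x emits one byte per value in the order the values appear in the ASCII file, a single voxel can be located in fan_64.bin by its byte offset. The following is a minimal sketch, assuming the 64 x 64 x 62 dimensions and the first-index-fastest ordering of the Fortran write statement; the names voxel_offset and read_voxel are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

#define NX 64
#define NY 64
#define NZ 62

/* Byte offset of voxel (i, j, k), using 0-based indices.
   Fortran's image(i,j,k) corresponds to voxel_offset(i-1, j-1, k-1). */
long voxel_offset(int i, int j, int k)
{
    assert(i >= 0 && i < NX);
    assert(j >= 0 && j < NY);
    assert(k >= 0 && k < NZ);
    return (long)i + NX * ((long)j + (long)NY * k);
}

/* Read one voxel from an open binary file.
   Returns 0..255 on success or a negative value on failure. */
int read_voxel(FILE *bin, int i, int j, int k)
{
    if (fseek(bin, voxel_offset(i, j, k), SEEK_SET) != 0)
        return -1;
    return fgetc(bin);          /* EOF (negative) if the file is too short */
}
```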

Create HDF file  "fan_64_sds.hdf"  from the same ASCII file "fan_64.ascii"

          % f77 m_sds_hdf.f -o m_sds_hdf.x -ldf
          % m_sds_hdf.x

The programs used to convert data types are listed below. These HDF and binary files will be used throughout this course. For more detailed information on HDF, see the following documents, which were downloaded from the HDF Documentation site on the NCSA HDF web page and included here: HDF 4.1r2 Users Guide, HDF 4.1r2 Reference Guide, and HDF Examples.

Fortran program for data conversion: ASCII to HDF

      program m_sds_hdf
c
c  creates an HDF file from an ASCII integer file
c  for more information on subroutines and HDF
c  contact the NCSA at University of Illinois at
c     address:  NCSA Documentation Orders
c                   152 Computing Applications Bldg.
c                   605 East Springfield Ave.
c                   Champaign, IL 61820
c                   Phone: (217) 244-0072
c
c       Anonymous ftp file server name: ftp.ncsa.uiuc.edu
c
      integer image(64,64,62)
      real val(64,64,62) 
      integer shape(3),ret 
      integer DFSDsetdims
      integer DFSDputdata
c
      open(5,file='fan_64.ascii',status='old',err=88)
c
      shape(1)=64
      shape(2)=64
      shape(3)=62
c
c  read in integer values from input file
c
      ix=shape(1)
      iy=shape(2)
      iz=shape(3)
c      
      read(5,10)(((image(i,j,k),i=1,ix),j=1,iy),k=1,iz)
   10 format(20(1x,i3))
c
c change mode from integer array (image)
c               to floating point array (val)
c
      do 20 k=1,iz
      do 20 j=1,iy
      do 20 i=1,ix 
      val(i,j,k)=image(i,j,k)
   20 continue
c
c  write val to an HDF file
c
      ret=DFSDsetdims(3,shape) 
      ret=DFSDputdata('fan_64_sds.hdf',3,shape,val)
c      
      if (ret .ne. 0)then
      write(*,*)'Error Writing HDF File'
      endif
c  
   88 stop
      end

C-program for data conversion: ASCII to unsigned 8bit (1-byte) binary

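/*  atoi.c - converts ascii integers between 0 and 255 to
                  binary integers of type unsigned char (1-byte)  */

#include <stdio.h>

int main(void)
{
    int in, flag = 0, accum = 0;

    while ((in = getchar()) != EOF) {
        if (in >= '0' && in <= '9') {
            /* digit: append it to the number being accumulated */
            accum = accum * 10 + (in - '0');
            flag = 1;
        } else if (flag == 1) {
            /* first non-digit after a number: emit it as one byte */
            putchar(accum);
            flag = 0;
            accum = 0;
        }
    }
    if (flag == 1)
        putchar(accum);   /* last number when no separator follows it */
    return 0;
}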

C-program for data conversion: unsigned 8bit (1-byte) binary to ASCII

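/* itoa.c - converts binary integers of type
		      unsigned char (1-byte)
		      to ascii integer */

#include <stdio.h>

int main(void)
{
    int in, j = 0;

    while ((in = getchar()) != EOF) {
        printf(" %3d", in);
        if (j++ == 19) {
            /* 20 values written: end the line */
            printf("\n");
            j = 0;
        }
    }
    if (j != 0)
        printf("\n");     /* terminate a final partial line */
    return 0;
}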
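
After converting in either direction, it can be useful to confirm that the ASCII and binary files hold the same values. The sketch below is one possible check, assuming the ASCII file contains whitespace-separated integers and the binary file contains one unsigned byte per value; the function name first_mismatch is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Compare an ASCII integer file with a 1-byte-per-value binary file.
   Returns -1 if both hold the same sequence of values, otherwise the
   0-based index of the first value that differs or is missing. */
long first_mismatch(FILE *ascii, FILE *bin)
{
    long index = 0;
    int value, byte, got_value;

    assert(ascii != NULL && bin != NULL);
    for (;;) {
        got_value = (fscanf(ascii, "%d", &value) == 1);
        byte = fgetc(bin);
        if (!got_value && byte == EOF)
            return -1;                 /* both files ended together */
        if (!got_value || byte == EOF || value != byte)
            return index;              /* one ended early or values differ */
        index++;
    }
}
```

Both files would be opened with fopen before the call, the binary one in "rb" mode.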



R.D. Kriz
Va. Tech
College of Engineering
Revised 02/11/95

http://www.sv.vt.edu/classes/ESM4714/Gen_Prin/data_org/data_org.html