Internet Programming with Windows - A Tutorial

Part 1

By Gandalf Gandalf@dhamma.org.uk (c) Feb 2000

 

Contents

Introduction
What You Will Need
Compiler Comments
The Internet
Protocols
The Windows TCP/IP Stack
Our First Program Using TCP/IP - WHOIS.C
Conclusion and What’s Next


Introduction

I decided to write this tutorial mainly as a means of getting my head around programming Internet applications under the Microsoft Windows OS. This is a topic that is not always easy to understand. There are many web sites, references and books on the subject, but it takes a lot of hard work to plough through them all to sift the good stuff from the junk.

Let me say one thing from the start. I don't like Windows, never have and I suspect I never will. I've been a Unix and Linux user (Minix before Linux came out) now for longer than I care to remember. Linux is probably the best and easiest OS to use for developing Internet applications. There are lots of applications and programs available freely for Linux, most with freely available source code. Whenever I've wanted to develop a specific tool under Linux, I've done what probably most Linux users do...download pieces of code and patch in whatever else is needed to perform the required task. It’s pretty easy to do without necessarily understanding the underlying principles. Yeah, things go wrong, but a bit of tweaking usually gets you there in the end.

So why did I decide to do this under Windows?

Good question.

I started to use Windows more and more since Win95 came out as my profession called for me to keep up with understanding Windows as well as Unix. When I found myself having to develop Windows applications, I immediately started using Visual Basic so I could code easily rather than have to learn Windows programming with C++. I've been a C programmer for about 15 years, but mainly on Unix. I've never used C++ much before, and until recently, I thought it was a pre-requisite for programming under Windows (fool I hear you say).

The main problem I've got with programming under Windows is that most freely available source code is written to use the GUI. It always seems that 90% of the code goes towards "making things pretty" etc. To a non-Windows guru, this can be very daunting, as it's almost impossible for a person not used to C++ to decipher what the hell is going on.

It came to a head recently when I wanted to write an application to allow me to retrieve multiple emails from multiple POP servers into one big email spool. I wanted to be able to read the spool via a single mail editor, and have the editor fill my reply email headers automatically.

So I decided to get to grips with both C programming for Windows and also accessing the functions within the winsock.dll. After that, I decided to dig deeper and delve into coding direct without using the Winsock dll.

Anyway, enough of the ramble. This tutorial should grow on a weekly basis (it'll take me a week or so to write each section as I decipher my own notes I wrote whilst exploring this) and will cover Internet programming using mainly C routines for Windows.

The plan is to start simply by accessing the Winsock dll for all initial programs. At each stage, a brief explanation and useful working source code will be presented to explain various concepts. All programs have been written as command line programs. No GUI, so the code is easier to understand.

The tutorial will build up to a much more advanced level, resulting in the development of a complete new Winsock dll with a lot more callable routines that the hacker and phreaker will like!

Some assembly language programming will be required for later stages of the tutorial, but I'll be releasing these routines as a dll for anyone who doesn't want to learn assembly, but still wants to access the routines.

All of the way through the tutorial, explanations of the underlying concepts of internet programming will hopefully be built up to an advanced level. For example, from the outset various Internet protocols will be described very briefly. Further on in the tutorial, each of the protocols will be explored to a byte level.


What You Will Need

A basic understanding of C.
A C compiler for Windows.
Suitable Windows libraries (mainly Winsock and ICMP).
Windows 95/98 or NT.
The desire to learn.

For those who have never programmed in C before (and I'm not talking about for Windows), you will need to download one of the many C programmer's tutorials/guides/references that are freely available on the net. Ignore any that are Windows related, as I will be including a brief chapter on using C to program Windows apps. If you can understand Basic or any other programming language, C shouldn't be much of a problem.

C compilers are available free for download. For learning basic C, I would recommend Borland's TurboC. They have released version 1 (I believe) for free download, although it won't be suitable for Windows apps.

I use the free GCC package with RSXNTDJ (also free) for developing Windows apps. These are downloadable from www.delorie.com. The main reason I use this is that it is a port of the Linux GCC package, so I'm familiar with it.

Alternatively, you can buy, beg, steal or borrow a commercial copy of whatever compiler you want.

Windows libraries can be downloaded from Microsoft’s SDK pages (although I couldn't find one for ICMP). For the GCC package you don't even need the libraries as you can convert the equivalent dll's into .a libraries using the makelib tool supplied.

All the routines presented in this tutorial have been tested under NT4 Service Pack 5. I make no guarantees they will work with other win32 OS's although they should. If you run into problems, then email me and I'll try to come up with a solution.


Compiler Comments

Unfortunately I will have to restrict comments in this section to the use of GCCW32 and RSXNTDJ as I don’t have access to any other win32 compilers. If anyone wants to give feedback on using other compilers, then please email me and I’ll include it in the updates (along with your email address so you can get the "why doesn’t this work" questions).

With these packages, you can’t use the standard Lib’s that come with the Microsoft SDK, however there is a MAKELIB utility program that can convert MS DLL’s into a .a library.

According to the specs, the header files included with RSXNTDJ state that WINSOCK is supported up to and including version 2.0, however, I could not get programs to compile with NT4.0 SP5’s wsock32.dll after I converted it using MAKELIB. You can set up RSXNTDJ and GCCW32 to use the MS SDK headers, but as the SDK is a hefty download I have not tried this. I will eventually get round to trying it and will update this section.

Until then, I am using an older version of Winsock that was left under my system32 directory after an update and had been renamed as wsock32n.dll. This appears to be version 1.1, but works OK.

You will need to convert both wsock32n.dll and icmp.dll into .a libraries using the MAKELIB utility.

Unfortunately, the help file that comes with RSXNTDJ is wrong, so here’s the protocol:-

Copy wsock32n.dll and icmp.dll into your rsxntdj/lib directory, then from a command prompt type the following: -

makelib wsock32n.dll -o libwsock.a

And also: -

makelib icmp.dll -o libicmp.a

In your makefile for gccw32, reference these as -lwsock and -licmp

E.g. for the first program that we will compile (whois.c), the makefile looks like this: -

PROJECT = whois
OBJS = whois.o
LIBS = -lwsock
include ..\..\rsxntmak.gnu

(Make sure the path for rsxntmak.gnu is correct for your system)


The Internet

I thought a very brief introduction to the Internet and some basic terminology might be a good idea to jog people’s memory on certain points, and maybe clear up a few misconceptions. This section is general; the following sections relate specifically to how Windows handles Internet access. It all starts very basic, but by the end of the tutorial it will get pretty hairy.

There are millions of computers connected to the Internet, some via dial up when needed, and others which have a permanent "live" connection.

To connect to the computer you want to talk to, you need some way of identifying this computer. It will have a unique "name" and also an IP address, which is a 32-bit number.

The "name" will never change unless the user decides to connect to the internet via a different dial up account, but the IP address may do, especially for most dial up ISP’s.

"Names" are specified by a convention called DNS, which stands for Domain Name System. RFC’s 1032, 1033, 1034 and 1035 cover DNS for those who want to delve deeper.

OK ... let’s take a practical example.

Let’s take my account I use to post to PHUK as an example.

The unique "name" is gandalf.freeuk.com

The IP number changes every time I dial up the account.

So how are these all related?

We’ll start with the name. A DNS name has the following format: -

subdomain.subdomain.domain

There can be many of the subdomain bits. In my case, com is the domain, freeuk is a subdomain and gandalf is the subdomain within freeuk.
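To make the naming format concrete, here is a minimal sketch in C that breaks a DNS name into its dot separated parts. It is not part of whois.c; the helper name split_labels and the two size limits are assumptions made up for this illustration.

```c
#include <string.h>

#define MAX_LABELS 16      /* arbitrary limit chosen for this sketch */
#define MAX_LABEL_LEN 64   /* room for a 63 character part plus the zero */

/*
 * split_labels (hypothetical helper): copies each dot separated part of
 * name into labels[] and returns how many parts were found, or -1 if a
 * part is empty or too long, or there are more than MAX_LABELS parts.
 */
int split_labels(const char *name, char labels[][MAX_LABEL_LEN])
{
  int count = 0;

  for (;;)
  {
    size_t len = strcspn(name, ".");     /* length up to the next dot */

    if (len == 0 || len >= MAX_LABEL_LEN || count == MAX_LABELS)
      return -1;

    memcpy(labels[count], name, len);
    labels[count][len] = '\0';
    count++;

    if (name[len] == '\0')               /* no more dots, we're done */
      return count;
    name += len + 1;                     /* step over the dot */
  }
}
```

Called with gandalf.freeuk.com, the routine finds three parts: gandalf, freeuk and com. The last part is the domain and everything before it is a subdomain.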

The IP Address

We will discuss the IP protocol later on in this section, but for now, it is sufficient to say that IP stands for Internet Protocol and the IP address is the basis for identifying all computers on the internet (and indeed often on private networks). An IP address consists of four octets that define a unique address e.g. 176.116.32.6
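As a rough illustration of how four octets make up one 32-bit number, the sketch below packs a dotted address into an unsigned long with the first octet in the top eight bits. The helper name pack_ip is made up for this example. Winsock has its own inet_addr routine for real programs, so treat this purely as a way of seeing the arithmetic.

```c
#include <stdio.h>

/*
 * pack_ip (hypothetical helper): turns "a.b.c.d" into a 32-bit number
 * with the first octet in the top eight bits. Returns 1 on success and
 * 0 if the text is not four numbers in the range 0 to 255.
 */
int pack_ip(const char *text, unsigned long *out)
{
  unsigned int a, b, c, d;
  char extra;

  /* the trailing %c catches any junk after the fourth number */
  if (sscanf(text, "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4)
    return 0;
  if (a > 255 || b > 255 || c > 255 || d > 255)
    return 0;

  *out = ((unsigned long)a << 24) | ((unsigned long)b << 16) |
         ((unsigned long)c << 8)  |  (unsigned long)d;
  return 1;
}
```

For the example address 176.116.32.6 the octets are B0, 74, 20 and 06 in hex, so the packed value is 0xB0742006.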

When I connect to freeuk.com and successfully login, their server assigns an IP address to my computer. This is different every time I login. It is a dynamic IP address, i.e. it only relates to my computer for the duration of my connection to the Internet.

Some ISP’s issue static IP’s, where even if you aren’t connected the same IP address is reserved for your computer.

The handling of DNS name and equivalent IP address is done by the ISP’s Nameservers.

Part Two of the tutorial will dig deeper into IP address classes and subnets etc.


Protocols

There are quite a few protocols that need to be understood to get to grips with writing your own Internet applications. For this part of the tutorial, we don’t need to know the protocols in depth. But I’ll briefly describe some of them here. Part Two of the tutorial will discuss all the protocols to a byte by byte level for those who really want to know how things work.

PPP

When you use dial up networking to connect to your ISP, Windows uses PPP to talk to the ISP’s computer. PPP stands for Point to Point Protocol. It negotiates the link settings and login with the ISP (the modems themselves sort out baud rates) and also gets an IP address from the ISP.

IP

The Internet Protocol is responsible for transporting data packets across the Internet and is primarily used by the routers to give them enough information to get the data packets to their destination. Sometimes IP may fragment a packet into smaller chunks, where each chunk may arrive at the destination in any order and by different network routes. It is the responsibility of IP on the destination computer to reassemble the chunks back into the original packet.

TCP

IP by itself is not a very reliable protocol as it sends info out and never checks whether it gets to the destination. Making the transfer reliable is usually the job of TCP, which stands for Transmission Control Protocol. Any Internet application that requires reliable data transfer (e.g. HTTP, POP, SMTP etc) uses TCP to handshake the data transfer. TCP runs on top of IP.

UDP

UDP stands for User Datagram Protocol and is also a transport protocol. Unlike TCP it doesn’t use handshaking, so it is faster but not as reliable. It is often used for streaming audio or video over the Internet. Like TCP, UDP runs on top of IP.
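To tie the two transports to something practical, here is a small sketch of a lookup table for a few well known services. The table layout and the name find_service are assumptions made up for this example. The port numbers are the standard ones, and the dns entry is there only to show a service that normally uses UDP.

```c
#include <string.h>

struct service
{
  const char *name;       /* service name */
  unsigned short port;    /* well known port number */
  const char *transport;  /* "tcp" or "udp" */
};

static const struct service services[] =
{
  { "whois", 43,  "tcp" },
  { "smtp",  25,  "tcp" },
  { "http",  80,  "tcp" },
  { "pop3",  110, "tcp" },
  { "dns",   53,  "udp" }   /* ordinary lookups; DNS can also use TCP */
};

/* find_service (hypothetical helper): returns the entry or NULL */
const struct service *find_service(const char *name)
{
  size_t i;

  for (i = 0; i < sizeof(services) / sizeof(services[0]); i++)
    if (strcmp(services[i].name, name) == 0)
      return &services[i];
  return NULL;
}
```

A program could use the transport field to decide whether to open a TCP or a UDP socket, and the port field when filling in the address it connects to.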

ICMP

ICMP stands for Internet Control Message Protocol. It is closely related to IP (in fact it is usually encapsulated within the IP data packets). It is used by programs such as PING etc to check if parts of the Internet are up and running. It is not often used for much else (from the Windows user perspective), but it is used extensively by all of the routers on the Internet.

That’s all we’ll say about protocols at the moment until part two of the tutorial. We’ll present a few programs at the end of this tutorial, which will demonstrate the use of windows socket programming using TCP/IP and also ICMP.


The Windows TCP/IP Stack

The Windows TCP/IP stack (the Winsock dll) takes all the hard work out of programming Internet applications. It handles lower level access for protocols such as TCP, UDP and IP. A separate dll (icmp.dll) handles the ICMP protocol.

We will ease gently into programming via the various Windows dll’s.

This just about reaches the end of the first part of the tutorial. "Hey," I hear you all say, "where’s the code and examples?"

Okay, the first part of the tutorial (the download you are reading now) is a basic introduction as to why I decided to write this tutorial. It hopefully gives a basic explanation of various protocols that will be explored in continuing parts of the tutorial. Without these, people who want to learn from scratch will be out of their depth.

So, I can’t leave this without giving a bit of code for all to try.

The following program provides a template for TCP/IP access using C.

Yes, if you want to do other things, you will have to understand the relevant RFC’s and will also need to use a few more routines from the Winsock dll.

What we have presented so far should give you a basic framework to allow you to lay out your own programs.

Don’t worry if you don’t want to, the second tutorial is all about implementing routines from the Winsock dll. It has plenty of examples covering TCP/UDP and ICMP.


Our First Program Using TCP/IP - WHOIS.C

Whois will be familiar to Linux users. If you want to find out information about a certain DNS name, you can run a whois query to find out information on network address numbers, administrative contact details and valid name servers etc. Official whois servers exist all over the world. A couple of the most useful ones are whois.internic.net (for com, net, org and edu domains) and whois.nic.uk (for co.uk, org.uk, ltd.uk, net.uk etc).
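The program in this part leaves the choice of server to the user, but the split between the two servers above could be sketched as follows. The helper names ends_with and pick_whois_server are made up for this example, and sending everything that isn't a .uk name to whois.internic.net is a simplifying assumption.

```c
#include <string.h>

/* returns 1 if name ends with suffix (exact, case sensitive match) */
static int ends_with(const char *name, const char *suffix)
{
  size_t n = strlen(name), s = strlen(suffix);

  return n >= s && strcmp(name + n - s, suffix) == 0;
}

/*
 * pick_whois_server (hypothetical helper): chooses between the two
 * servers mentioned above. Anything that isn't a .uk name goes to
 * whois.internic.net.
 */
const char *pick_whois_server(const char *domain)
{
  if (ends_with(domain, ".uk"))
    return "whois.nic.uk";
  return "whois.internic.net";
}
```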

The program accepts one or two command line arguments: -

Whois domainname whois.server

It saves any query results to a file called whoislog.txt.

So, for example, if I wanted to query whois.internic.net about freeuk.com, I would use: -

Whois freeuk.com whois.internic.net

If the second argument is missing, the program automatically uses whois.internic.net.
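The argument handling can be sketched on its own like this. It is a cut down illustration rather than the listing itself: the helper names server_from_args and build_query are made up here, and the query is simply the domain name followed by CRLF, which is what whois servers expect.

```c
#include <string.h>

#define DEFAULT_SERVER "whois.internic.net"

/* returns the server named on the command line, or the default */
const char *server_from_args(int argc, char **argv)
{
  return (argc >= 3) ? argv[2] : DEFAULT_SERVER;
}

/*
 * build_query (hypothetical helper): writes domain followed by CRLF into
 * out. Returns the length of the query, or -1 if it does not fit.
 */
int build_query(const char *domain, char *out, size_t out_size)
{
  size_t len = strlen(domain);

  if (len + 3 > out_size)       /* domain + CR + LF + terminating zero */
    return -1;

  memcpy(out, domain, len);
  memcpy(out + len, "\r\n", 3); /* copies the terminating zero as well */
  return (int)(len + 2);
}
```

Checking the length before copying means a very long command line argument is rejected instead of overrunning the buffer.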

Here is the output for a query on freeuk.com: -

Whois Session Started Sat Feb 26 23:52:33 2000.

Connecting to whois.internic.net

Domain names in the .com, .net, and .org domains can now be registered
with many different competing registrars. Go to http://www.internic.net
for detailed information.

Domain Name: FREEUK.COM
Registrar: NETWORK SOLUTIONS, INC.
Whois Server: whois.networksolutions.com
Referral URL: www.networksolutions.com
Name Server: NS0.FREEUK.NET
Name Server: NS1.FREEUK.NET
Name Server: NS2.CLARA.NET
Updated Date: 15-sep-1999


>>> Last update of whois database: Sat, 26 Feb 00 02:54:30 EST <<<

The Registry database contains ONLY .COM, .NET, .ORG, .EDU domains and
Registrars.

So what happens if I want to query my own freeuk account:-

Whois gandalf.freeuk.com

Result:-

Whois Session Started Sun Feb 27 17:33:41 2000.

Connecting to whois.internic.net

Domain names in the .com, .net, and .org domains can now be registered
with many different competing registrars. Go to http://www.internic.net
for detailed information.

No match for "GANDALF.FREEUK.COM".

>>> Last update of whois database: Sun, 27 Feb 00 02:36:32 EST <<<

The Registry database contains ONLY .COM, .NET, .ORG, .EDU domains and
Registrars.

So why didn’t it find my account?

Well Internic only handles registered domains and not subdomains. We will find out in part two of the tutorial how to query freeuk to find out details on subdomain accounts.

Anyway, here is the program listing. There is a download section at the end of this tutorial with links to download the source, libraries, makefile etc for this and all subsequent programs in this tutorial.

/*
 * whois.c
 *
 * (c) Feb 2000 by Gandalf
 *
 */

#define WIN32_LEAN_AND_MEAN        /* must come before windows.h to take effect */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>                /* strcpy, strcat, strlen, memset */
#include <time.h>
#include <windows.h>
#include <winsock.h>

void handle_error(void);           /* Error handler routine */
void write_file(char *buf);        /* Write details to log file */

int main (int argc, char **argv)
{
  WORD wVersionRequested;          /* socket dll version info */
  WSADATA wsaData;                 /* data for socket lib initialisation */
  SOCKET sock;                     /* socket details */
  enum { BUF_LEN=10000 };          /* Buffer size (a true constant, so File_Buf has a fixed size) */
  struct sockaddr_in address;      /* socket address stuff */
  struct hostent * host;           /* host stuff */
  int err;                         /* error trapping */
  float socklib_ver;               /* socket dll version */
  char File_Buf[BUF_LEN];          /* file buffer */
  char DomainName[100];            /* domain name from user */
  char HostName[100];              /* host name from user */
  time_t now;                      /* for date and time */

  if (argc < 2)                    /* check we have command line options */
  {
    printf("\nUsage: whois domainname [whois.server]\n");
    exit(0);
  }

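  /*
   * The buffers start out uninitialised, so the first copy into each must
   * be strcpy rather than strcat. Reject arguments that would not fit.
   */

  if (strlen(argv[1]) > sizeof(DomainName) - 3 ||
      (argc == 3 && strlen(argv[2]) > sizeof(HostName) - 1))
  {
    printf("\nError: argument too long.\n");
    exit(1);
  }

  strcpy(DomainName, argv[1]); /* get domain name from command line */
  strcat(DomainName, "\r\n");  /* add crlf as whois servers expect it */

  if (argc == 3)
    strcpy(HostName, argv[2]); /* get host name from command line */
  else
    strcpy(HostName, "whois.internic.net");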

   wVersionRequested = MAKEWORD( 1, 1 );

   /*
    * We need to call the WSAStartup routine BEFORE we try to use any of
    * the Winsock dll calls.
    */

   if ( WSAStartup( wVersionRequested, &wsaData ) != 0 )
     handle_error();

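   /*
    * Check socket DLL supports 1.1 or higher. The low byte of wVersion
    * holds the major version number and the high byte the minor one.
    * We compare the bytes directly rather than trusting a float compare.
    */

   socklib_ver = LOBYTE( wsaData.wVersion );
   socklib_ver += HIBYTE( wsaData.wVersion ) / 10.0;

   if ( LOBYTE( wsaData.wVersion ) < 1 ||
        ( LOBYTE( wsaData.wVersion ) == 1 && HIBYTE( wsaData.wVersion ) < 1 ) )
   {
     printf ("\nError: socket library must support 1.1 or greater (found %.1f).\n",
             socklib_ver);
     WSACleanup(); /* clean up before exit */
     exit(0);
   }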

   /* write current date and time to log file and screen */

   time(&now);
   sprintf(File_Buf, "Whois Session Started %.24s.\n\n", ctime(&now));
   write_file(File_Buf);

  /*
   * Open a socket. The AF_INET parameter tells windows we want to use the
   * internet address family. Other families can be chosen for different
   * networking e.g. for netbios, IPX etc. The SOCK_STREAM parameter lets
   * windows know we want a stream socket, which for AF_INET means TCP
   * rather than UDP. The final parameter is the protocol number; zero
   * tells windows to pick the default protocol for the family and socket
   * type we asked for, which here is TCP.
   */

   if ( (sock = socket(AF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET )
     handle_error();

   /* We now need to initialise a couple of variables in the address
    * structure. Once again, to tell windows we are using the internet,
    * and also what port we want to use when connecting to a remote
    * computer. In this case it is port 43 which is the standard port for
    * whois. The htons routine converts a 16-bit number from the byte order
    * of the host (Intel chips store the low byte first) to network byte
    * order, which is high byte first as on Motorola chips.
    */

   memset(&address, 0, sizeof(address)); /* clear unused fields first */
   address.sin_family=AF_INET;       /* internet */
   address.sin_port = htons(43);     /* port 43 for whois */

   /* write to the log file and screen */

   sprintf(File_Buf,"Connecting to %s\n", HostName);
   write_file(File_Buf);

   /*
    * host is a pointer to a structure of the predefined type hostent. We
    * need to call gethostbyname with the DNS name we want to use to return
    * a pointer to a hostent structure. This is so we can resolve an IP
    * address from our ISP's nameserver.
    */

    if ( (host=gethostbyname(HostName)) == NULL )
      handle_error();

    /* we then initialise the address structure with the resolved IP address */

    address.sin_addr.s_addr=*((unsigned long *) host->h_addr);

    /* Now we're ready to actually connect to the whois server itself */

    if ( (connect(sock,(struct sockaddr *) &address, sizeof(address))) != 0)
      handle_error();

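    /*
     * We should be connected to the whois server at this point
     * so we need to send the domain name and wait for the response. The send
     * and recv routines are used on connected sockets, which is what TCP
     * gives us. The sendto and recvfrom routines are normally used for UDP,
     * ie without a connection or handshaking.
     */

    strcpy(File_Buf, DomainName);
    err=send(sock,File_Buf,strlen(File_Buf),0); /* send domain name */
    if (err == SOCKET_ERROR)
      handle_error();

    /*
     * recv only returns whatever has arrived so far, so keep reading until
     * the server closes the connection (recv returns 0) or the buffer is
     * full. The data is not null terminated, so we terminate it ourselves.
     */

    {
      int total = 0;                 /* bytes received so far */

      while ( total < BUF_LEN-1 &&
              (err=recv(sock,File_Buf+total,BUF_LEN-1-total,0)) > 0 )
        total += err;

      if (err == SOCKET_ERROR)
        handle_error();

      File_Buf[total] = '\0';
      write_file(File_Buf);          /* log the query results */
    }

    closesocket(sock);

    /* Always call WSACleanup before exiting */

    WSACleanup(); /* clean up before exit */
    exit(0);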
  }

  void handle_error(void)
  {
    /*
     * Errors are handled by calling the WSAGetLastError routine which
     * will return the last error as one of the following. As we develop
     * this tutorial, we will go into much more detail on what they mean
     * and what caused them.
     */

    switch ( WSAGetLastError() )
    {
      case WSANOTINITIALISED :
        printf("Unable to initialise socket.\n");
      break;
      case WSAEAFNOSUPPORT :
        printf("The specified address family is not supported.\n");
      break;
      case WSAEADDRNOTAVAIL :
        printf("Specified address is not available from the local machine.\n");
      break;
      case WSAECONNREFUSED :
        printf("The attempt to connect was forcefully rejected.\n");
        break;
      case WSAEDESTADDRREQ :
        printf("A destination address is required.\n");
      break;
      case WSAEFAULT :
        printf("The namelen argument is incorrect.\n");
      break;
      case WSAEINVAL :
        printf("The socket is not already bound to an address.\n");
      break;
      case WSAEISCONN :
        printf("The socket is already connected.\n");
      break;
      case WSAEADDRINUSE :
        printf("The specified address is already in use.\n");
      break;
      case WSAEMFILE :
        printf("No more file descriptors are available.\n");
      break;
      case WSAENOBUFS :
        printf("No buffer space available. The socket cannot be created.\n");
      break;
      case WSAEPROTONOSUPPORT :
        printf("The specified protocol is not supported.\n");
        break;
      case WSAEPROTOTYPE :
        printf("The specified protocol is the wrong type for this socket.\n");
      break;
      case WSAENETUNREACH :
        printf("The network can't be reached from this host at this time.\n");
      break;
      case WSAENOTSOCK :
         printf("The descriptor is not a socket.\n");
      break;
      case WSAETIMEDOUT :
        printf("Attempt timed out without establishing a connection.\n");
      break;
      case WSAESOCKTNOSUPPORT :
         printf("Socket type is not supported in this address family.\n");
      break;
      case WSAENETDOWN :
        printf("Network subsystem failure.\n");
      break;
      case WSAHOST_NOT_FOUND :
        printf("Authoritative Answer Host not found.\n");
      break;
      case WSATRY_AGAIN :
        printf("Non-Authoritative Host not found or SERVERFAIL.\n");
       break;
      case WSANO_RECOVERY :
         printf("Non recoverable errors, FORMERR, REFUSED, NOTIMP.\n");
      break;
      case WSANO_DATA :
        printf("Valid name, no data record of requested type.\n");
      break;
      case WSAEINPROGRESS :
        printf("A blocking Windows Sockets operation is in progress.\n");
      break;
      case WSAEINTR :
        printf("The (blocking) call was canceled via WSACancelBlockingCall().\n");
      break;
      default :
        printf("Unknown error.\n");
       break;
  }

  WSACleanup();
  exit(1);                         /* non-zero status signals failure */
}

void write_file(char *buf)
{
  /* writes results to a log file and also to the screen */

  FILE *fp=fopen("whoislog.txt","a+");
  if (fp != NULL)                  /* skip the log if the file can't be opened */
  {
    fprintf(fp,"%s\n",buf);
    fclose(fp);
  }
  printf("%s\n",buf);
}

So...that was our first program using TCP/IP. It is a basic template for virtually any program you will want to write using TCP/IP. All you need to do is change the relevant details, such as the domain name, port and data to send, depending on what type of server you are trying to connect to. For example, if you wanted to connect to a POP server, then you would change the domain name to the name of the POP server (e.g. relay.freeuk.net). You would also change the port number to 110, and once a valid connection has been established, change the send and recv part of the program to negotiate collection of your email. For virtually all recognised data transfer protocols such as POP/SMTP etc, there is an RFC that covers the required send and receive string formats. For POP3 it is RFC 1939 (which replaced RFC 1725) and for SMTP it is RFC 821 (RFC 1869 covers the SMTP service extensions). These are freely available on the web.
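As a taste of what changing the send and recv part might involve for POP, here is a minimal sketch of two helpers. The names pop_command and pop_reply_ok are made up for this example. The command verbs (USER, PASS, QUIT) and the +OK and -ERR reply prefixes come from the POP3 RFC, but this is only a sketch of the string handling and not a working mail client.

```c
#include <stdio.h>
#include <string.h>

/*
 * pop_command (hypothetical helper): builds "VERB argument\r\n", or just
 * "VERB\r\n" when argument is NULL. Returns the length, or -1 if the
 * command does not fit in out.
 */
int pop_command(char *out, size_t out_size, const char *verb,
                const char *argument)
{
  /* verb + CRLF, plus a space and the argument when there is one */
  size_t need = strlen(verb) + 2 + (argument ? strlen(argument) + 1 : 0);

  if (need + 1 > out_size)         /* + 1 for the terminating zero */
    return -1;

  if (argument)
    sprintf(out, "%s %s\r\n", verb, argument);
  else
    sprintf(out, "%s\r\n", verb);
  return (int)need;
}

/* a POP3 server starts every reply with +OK on success or -ERR on failure */
int pop_reply_ok(const char *reply)
{
  return strncmp(reply, "+OK", 3) == 0;
}
```

In the whois template you would build a command with pop_command, pass the buffer and the returned length to send, read the reply with recv, and use pop_reply_ok to decide whether to carry on.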


Conclusion and What’s Next

Okay...Nothing difficult so far.

We have discussed a few protocols, and we have introduced a simple program that can be compiled and used under Windows to do a domain name enquiry.

The basic framework won’t change for the next tutorial.

Part Two of the tutorial will illustrate further programs that access the Winsock dll. We will also introduce accessing the ICMP dll.

Anyway, enough for now. Check back in a couple of weeks for part two of the tutorial.

Send any feedback, errors, typos and wants to gandalf@dhamma.org.uk.

 
