What's the difference between these 2 TCP streams?

Hello fellow packet sleuths;

I am in a predicament writing firewall software that's supposed to middleman an HTTP connection.

Here is the TCP stream of the non-middleman'd connection: https://i.imgur.com/slD6Ke2.png

Here is the TCP stream of the middleman'd connection: https://i.imgur.com/33rDBJo.png

My problem begins once an HTTP request spans more than 1 packet. Once a request is split across 2 or more packets, it never seems to get recreated properly on the initiating side.

The unexpected behavior begins after the 404 page is sent back. Instead of acknowledging receipt and terminating the connection, the client sends a "TCP Spurious Retransmission" of supposedly non-acknowledged data.

The retransmitted data begins at an offset from the start of the request equal to the second packet's length. In other words, for some reason the client thinks only the first few hundred bytes have been received, and retransmits the 1460 bytes it thinks are missing from the end of the request.

So if packet 67 has an HTTP payload length of 400 bytes, the retransmit will contain information starting from payload byte 400 in packet 66 and go all the way up through information sent in packet 67, totaling 1460 bytes.
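
To make the byte ranges in that example concrete, here is a minimal Python sketch using relative offsets from the start of the request. The 1460- and 400-byte sizes are the illustrative figures from the example above, and the helper name `overlap` is hypothetical.

```python
def overlap(a, b):
    """Return the overlapping half-open byte range of a and b, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None

FIRST_LEN = 1460   # payload of packet 66 (a full-sized segment)
SECOND_LEN = 400   # payload of packet 67 (illustrative value from the post)

pkt66 = (0, FIRST_LEN)                           # bytes 0..1459 of the request
pkt67 = (FIRST_LEN, FIRST_LEN + SECOND_LEN)      # bytes 1460..1859
retransmit = (SECOND_LEN, SECOND_LEN + FIRST_LEN)  # 1460 bytes starting at byte 400

print(overlap(retransmit, pkt66))  # part of packet 66 that is sent again
print(overlap(retransmit, pkt67))  # part of packet 67 that is sent again
```

Under these assumptions the retransmission covers the tail of packet 66 and all of packet 67, which is the kind of overlap a reassembler would flag.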

What could be causing this retransmission of already acknowledged data? Why doesn't the middleman'd connection acknowledge receipt and terminate the connection like the non-middleman'd one?

I've spent too many hours banging my head on this, any assistance would be greatly appreciated.

level 1
Freelance Network Coder · 2 points · 3 months ago

Where was the second (middleman'd) wireshark capture taken?

(server)---------(middleman)---------(client)
            ^                   ^
           here (A)?           or here (B)?
level 2
Original Poster · 1 point · 3 months ago

They are both taken from the client side (B).

level 1

What does the [REASSEMBLY ERROR] part say on the first red packet? Wireshark can be pretty good at telling you what's going wrong.

Looks like the fragmentation isn't working properly at first glance.

level 2
Original Poster · 1 point · 3 months ago

The exact message is: [Reassembly error, protocol TCP: New fragment overlaps old data (retransmission?)]

level 1

Is the last fragment being returned with MF set to 0 (as it should be), not 1?

The size of the second packet being different (assuming all other factors are equal) is interesting (515 on the working, 521 on the not working).

level 2
Original Poster · 1 point · 3 months ago

All of the packets have the DF bit set.

The size is based on a random request, it isn't a relevant factor.

I almost thought for a second the answer could be a layer up but it doesn't appear so.

level 3

MF != DF

Though if the DF bit is set, and the MiTM box isn't handling an upstream need for fragmentation, that could be something.

level 4
Original Poster · 1 point · 3 months ago

Can you elaborate what an "upstream need for fragmentation" would entail?

level 5
Client (1500 MTU) --- (1500 MTU) LAN (1500 MTU) --- (1500 MTU) Router (e.g. 1478 ADSL MTU) --- WAN

Normally on receiving a 1500 byte packet with the DF bit set, the router will send an ICMP message back to the client to inform the client the packet is too large, and because the router can't fragment it for you (due to DF bit), the client needs to reduce transmission size.

You probably should capture either side of the MITM box, and include ICMP messages in your filter.

Also check the MF bit is being set properly. MF=1 on all fragments, except the last which should set MF=0.
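
As a rough illustration of the router behaviour described above, here is a minimal Python sketch. The function name `forward` and the tuple layout are hypothetical, it assumes a 20-byte IPv4 header with no options, and offsets are given in bytes (the real header field counts in units of 8 bytes).

```python
IP_HEADER = 20  # bytes, assuming no IP options

def forward(total_len, next_hop_mtu, df):
    """Model what a router does with an IPv4 packet of total_len bytes.

    Returns (action, detail). For "forward" and "fragment", detail is a list
    of (payload_offset, payload_len, mf) tuples, one per packet sent on.
    """
    payload = total_len - IP_HEADER
    if total_len <= next_hop_mtu:
        return ("forward", [(0, payload, 0)])
    if df:
        # ICMP Destination Unreachable (type 3), code 4:
        # "fragmentation needed and DF set". It reports the next-hop MTU
        # so the sender can reduce its packet size.
        return ("icmp-frag-needed", next_hop_mtu)
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_frag = (next_hop_mtu - IP_HEADER) // 8 * 8
    frags, offset = [], 0
    while offset < payload:
        size = min(per_frag, payload - offset)
        mf = 1 if offset + size < payload else 0  # MF=0 only on the last one
        frags.append((offset, size, mf))
        offset += size
    return ("fragment", frags)

print(forward(1500, 1478, df=True))
print(forward(1500, 1478, df=False))
```

If the ICMP message from the first case never reaches the client, the client keeps sending packets that are too large, which is why capturing ICMP on both sides of the MITM box is worth doing.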

level 6
Original Poster · 1 point · 3 months ago

Wouldn't the non-MITM connection see this too? The MITM routing is all done locally and won't have this problem.

level 7

I don't know; there's no diagram or schematic in your post. Is the MITM box inserted transparently, routed in, or configured as a proxy?

If it works without the MITM box, then the MITM box is either breaking something, or not dealing with other communications that allow the connection to be successful.

It may be that there are ICMP control messages between the client and the router that are fixing something up, and when you insert the MITM box it doesn't forward them.

Dunno.

Capture both sides is probably the only option, and don't just filter for SRC/DST/port 80... as there may be other components at play.

level 8
Original Poster · 0 points · 3 months ago

So I've now confirmed 2 things:

- There is no other traffic at play besides the already shown TCP streams, that is all we are dealing with here.

- The problem is not related to any IPv4 header field.

This means the problem is somewhere within the 2 screenshots above, if one of them with direct routing works, and another with middlebox intervention fails, but the streams are identical, what is the problem?

level 9

- The problem is not related to any IPv4 header field.

This means the problem is somewhere within the 2 screenshots above, if one of them with direct routing works, and another with middlebox intervention fails, but the streams are identical, what is the problem?

With all due respect, you came here asking a question and you're not giving enough information for us to help you. We can't possibly be sure that your assertion that "the problem is not related to the IPv4 header" is correct, since you're not sharing information about the IPv4 headers in your screenshots.

What exactly is in that 1514 byte packet? Is the full HTTP request formed of those 2 packets (1514+521 ?).

It seems to me that the middle box doesn't pass / doesn't like the 766 byte response (404 not found), so it keeps resending the 1514 byte packet.

There are multiple possible reasons for this: corrupt checksums, problems with LRO/TSO on the middlebox, a policy that drops it at a lower level, etc.
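
Since corrupt checksums are one of the candidates, here is a minimal Python sketch of the Internet checksum (RFC 1071) as used by the IPv4 header, which can help verify a header copied out of a capture by hand. The sample header bytes are a generic illustration, not taken from the captures in this thread.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum of data (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Illustrative 20-byte IPv4 header with the checksum field (bytes 10-11) zeroed.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = internet_checksum(header)
print(hex(checksum))

# A header that already contains a correct checksum sums to zero.
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
print(internet_checksum(filled) == 0)
```

The TCP checksum uses the same arithmetic but also covers a pseudo-header and the payload, so a middlebox that rewrites any of those bytes has to recompute it.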

The solution here is to take the same packet capture concurrently on the middlebox and figure out what your client is sending VS what the middlebox thinks it's receiving.

level 10
Original Poster · 1 point · 3 months ago

Unless the problem somehow lies within the Differentiated Services Field, I can say with pretty high confidence it is unrelated to IPv4.

Yes the full request is split across 2 packets, one being MTU and the other being some amount of hundreds of bytes.

Keep in mind these captures were taken from the client side, the middle box passes the 404 response as it should, but the client doesn't like it for some reason and retransmits the request.

What information from the middlebox would help troubleshoot why the client is discarding these packets instead of acknowledging them?

level 1

Is the return traffic being forced back through your mitm or is it going directly back to the client?

level 2
Original Poster · 1 point · 3 months ago

Both in and outbound traffic pass through the middle box.

level 1

This may be a dumb question, but is the ACK coming to the client formatted properly, such that the client associates it with the correct stream? If the client is resending the data, it sounds like the ACK either wasn't formatted properly or the client did not actually receive it. Also, is the middleman NAT'ing any of this traffic? There's very little structure to go off here, so a lot of assumptions are being made.

level 2
Original Poster · 1 point · 3 months ago

The ACK is formatted properly and the client does actually receive it.

The middlebox is not NAT'ing any of this traffic, but that shouldn't matter when it is the client that doesn't like the acknowledgements it is receiving and the client retransmitting for some reason.

level 3

I was just getting at the scenario where the middleman NATs the traffic: if the ACK from the server were addressed to the NAT'ed address and passed on to the client like that, the client would not relate it to the original traffic. Just a thought.

level 1

This problem is nearly impossible to puzzle through with those two screenshots from Wireshark.

It looks to me like the ACK in frame 69 is 4 bytes short, but the HTTP summary in frames 67 and 70 obscures the TCP values (Seq, Ack, Len) that would be needed to pinpoint the problem.

Put pcaps somewhere if you want real analysis to happen. You can obfuscate the src/dst IPs in the pcap if you need to.

level 2
Original Poster · 1 point · 3 months ago

What other information would you like?

The ACK value in frame 69 is correct (It is the next sequence value of frame 67).

You can assume the obscured seq / ack numbers are the correct values that they should be.

level 3

What other information would you like?

I'd like to make it into a time/sequence graph.

level 1
Freelance Network Coder · 1 point · 3 months ago

I don't have an explanation for the Seq number in #79, but observe this: packet #70 is a 766 byte packet from the Server to the Client. So that's (766 - 20 - 20 - 14 = 712) bytes of payload.

Now, since the Server's Sequence Number was 324830945 at #69, we can say that after sending these 712 bytes, the next expected sequence number would be (324830945 + 712 = 324831657).

In other words, the Ack number in #79 ought to be 324831657. That is not what we see, however. What we see in #79 is an Ack value of 324830945.

This means that the Client's TCP stack definitely did not receive #70.

As a matter of fact, 324830945 is the seq that the client expects right after the three way handshake, and it never moves forward.
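
The arithmetic above can be checked with a short Python sketch. It assumes an Ethernet frame with 20-byte IP and TCP headers (no options), the same assumption the calculation above makes; the function name `tcp_payload` is hypothetical.

```python
ETH, IP, TCP = 14, 20, 20  # header sizes in bytes, assuming no IP/TCP options

def tcp_payload(frame_len: int) -> int:
    """TCP payload bytes carried by an Ethernet frame of frame_len bytes."""
    return frame_len - ETH - IP - TCP

server_seq = 324830945        # server's sequence number at #69
payload = tcp_payload(766)    # frame #70, the 404 response
expected_ack = (server_seq + payload) % 2**32  # sequence numbers wrap at 2^32

observed_ack = 324830945      # Ack value seen in #79
print(payload, expected_ack, observed_ack == expected_ack)
```

If #70 carried TCP options (timestamps, for example), the payload would be correspondingly smaller, but the observed Ack would still have to be larger than the server's initial sequence number for the response to count as received.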

I have a feeling that the middleman's stack is somehow interfering with packets sent towards the Client in such a way that the Client's TCP stack never sees those packets *after* the 3-way handshake finishes.

You have obscured the IP addresses, so we don't know if they are correct. Also since Wireshark is not reporting any packet format errors, things like the checksum and similar packet formatting errors can be ruled out. I would examine the MAC addresses and other L2 fields if any to see why the Client's TCP stack is not seeing the packets from the Server.

BTW, look at #91. The Seq number here is the original HTTP request's first sequence number (681125146), i.e. it is less than the Seq number in #79.

level 2
Original Poster · 1 point · 3 months ago

This means that the Client's TCP stack definitely did not receive #70.

Welcome to my predicament. Although as you can see in the screenshot it definitely does receive this packet, instead of acknowledging such it does the retransmit.

Middlebox interference is practically irrelevant in these screenshots, as they were taken on the client side. Who cares what's going on at the middlebox and end device? Look at the client-side stream: it behaves exactly like the non-middlebox stream, up until it breaks for unexplainable reasons.

level 3
Freelance Network Coder · 3 points · 3 months ago

The fact that Wireshark at the client side sees the packet does not necessarily mean that the Client's TCP stack received the packet. For example if the Destination MAC address was corrupted, Wireshark would still see the packet (if wireshark was started in promiscuous mode) but the TCP stack wouldn't.
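
A minimal Python sketch of that distinction, under simplifying assumptions: the MAC addresses are made up, the function names are hypothetical, and multicast is treated as always accepted (a real NIC only accepts groups it has joined).

```python
def parse_mac(text: str) -> bytes:
    return bytes.fromhex(text.replace(":", ""))

def stack_accepts(dst: str, own: str) -> bool:
    """True if the host's IP stack would process a frame sent to dst."""
    dst_b, own_b = parse_mac(dst), parse_mac(own)
    is_group = bool(dst_b[0] & 0x01)  # lowest bit of first octet: multicast/broadcast
    return dst_b == own_b or is_group

def capture_sees(dst: str, own: str, promiscuous: bool) -> bool:
    """True if a capture on the host would show the frame."""
    return promiscuous or stack_accepts(dst, own)

own_mac = "02:00:00:00:00:01"    # hypothetical client MAC
corrupted = "02:00:00:00:00:99"  # hypothetical wrong destination MAC

print(capture_sees(corrupted, own_mac, promiscuous=True))  # visible in Wireshark
print(stack_accepts(corrupted, own_mac))                   # but not delivered to TCP
```

So a frame appearing in a client-side capture is not proof that the client's TCP stack processed it.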

level 4
Original Poster · 1 point · 3 months ago

If somehow the packets got lost from wireshark to application then it would make sense, but I really do not think this is the case. Layer 2 is not the problem.

Remember when we're retransmitting after getting the 404 page, we do acknowledge the first part of the request was received. We aren't completely retransmitting the entire thing. Surely this means the data is in fact reaching the application and just being treated unexpectedly.

level 5
Freelance Network Coder · 1 point · 3 months ago

We aren't completely retransmitting the entire thing

Actually we are. See the sequence number on packets 91, 107, 131 and 203. All carry the sequence number 681125146, which is the same sequence number on packet 66 (the original HTTP GET request). This tells us that the Client believes that the server did not see anything that it (the Client) sent.

This can be completely explained if the Client did not receive any packets at all, apart from the initial SYN-ACK (64). The only thing I cannot explain is the seq number on the *first* spurious retransmission. The *subsequent* spurious retransmissions do contain the correct sequence number. This is probably some aspect of TCP back-off/retransmit that I am not aware of, and hopefully some expert can clarify this.
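
The repeated sequence number is easy to spot programmatically. Here is a minimal Python sketch that groups client-to-server segments by sequence number; it uses only the frame numbers and sequence number quoted in this thread, and the function name `repeated_seqs` is hypothetical. Frames 67 and 79 are left out because their sequence numbers are not stated here.

```python
from collections import defaultdict

# (frame number, TCP sequence number) for client -> server segments
segments = [
    (66, 681125146),   # original HTTP GET
    (91, 681125146),
    (107, 681125146),
    (131, 681125146),
    (203, 681125146),
]

def repeated_seqs(segments):
    """Map each sequence number seen more than once to the frames carrying it."""
    by_seq = defaultdict(list)
    for frame, seq in segments:
        by_seq[seq].append(frame)
    return {seq: frames for seq, frames in by_seq.items() if len(frames) > 1}

print(repeated_seqs(segments))
```

Any sequence number that maps to more than one frame was sent again from that point, which here means the client restarted from the very first byte of the request.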

Please take a real close look at the L2 header as well as the IPv4 header (e.g. wrong or swapped destination/source IP addresses) on all the Server->Client packets (e.g. 70).

level 6
Original Poster · 1 point · 3 months ago

You are correct that the behavior described in my post only happens for the first retransmission; the subsequent retransmissions contain the full request body from the start. But if it was working correctly there wouldn't be any retransmissions to begin with.

The problem is nothing trivial like an incorrect MAC or IP, the fields are as they should be.
