[Assembler/C] How I Ended Up Writing A DNS Client and IRC Client In Assembler

Posted: 09 May 2019, 18:22
by alexandria
Background (This post)

It kind of feels weird to write about this in such a 'formal' atmosphere I guess. Forums weirdly carry that kind of atmosphere and I don't know why, but I do know that one step towards making it informal is to start using it as an informal space. I was talking to Michi and complaining that I don't really have a space to talk about this project, and he suggested speaking about it here. So, while my brain is working on the debugging for another project, I guess it might be worth posting about it here.

Background (The project)

Sorry if this is a bit disjointed, currently my memory is a bit all-over the place, I'm piecing this together from my git log and memory of how things went, haha.

Basically, a month or two ago I was tossing about stupid programming ideas with a friend. As a pure joke I said "IRC client written in Assembly using only syscalls", we laughed, the conversation moved on.

But. The seed was planted. The entire night, I couldn't sleep. I had this itch of a project putting down its roots. I already had three other half-finished projects on the go, I didn't need the work of ANOTHER project, in assembler, no less. A day or so later my willpower snapped like a twig.

The project - First Steps

Ok, so, I figured first things first: get basic networking sorted. First target: get a simple function, attach, that lets me shove in an IP and port and get a file descriptor out of it. That took about half a day of reading up on existing documentation of how to do this in C, and then writing a small prototype.
The early test code (ripped from git and retro-commented, lol) is roughly:

Code: Select all

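#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* _htons, aton and readbuffer are my own helpers, not shown here */

int main(int argc, char *argv[])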
{
       struct sockaddr_in a = {0};
       int s = 0;

       char reply[80];                                            /* one line + null byte */
       int replysize = 79;                                        /* one line size */
       memset(reply, 0, replysize+1);

       a.sin_family = AF_INET;                                    /* AF: Address Family; INET: IPv4 */
       a.sin_port = _htons(6667);                                 /* IRC port converted to big endian (network byte order) */
       aton(argv[1], &a.sin_addr.s_addr);                         /* Take the target ip address and convert it to a four byte (int)
                                                                     big endian 'ip address' that linux accepts */

       if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0) {           /* open an IPv4 stream socket */
               perror("socket"); return 1;
       }
       if ((connect(s, (struct sockaddr *)&a, sizeof(a))) < 0) {  /* connect the socket to the address filled in above */
               perror("connect"); return 1;
       }
       readbuffer(s, reply, replysize);
       printf("reply: %s\n", reply);

       close(s);
       return 0;
}
From the syscall man pages and some 'linux networking tutorials' it seemed clear that networking revolves around the sockaddr_in structure. I haven't included some auxiliary functions (for brevity), such as aton, a function that takes an IP address string and turns it into a big-endian IP address (which is essentially just an integer).
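For anyone following along, here is a minimal sketch of what helpers like aton and _htons could look like in plain C. This is not the code from the project: the names sketch_aton and sketch_htons are made up for illustration, the parser only accepts strict dotted-quad input, and the byte swap assumes a little-endian host such as x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for _htons: swap the two bytes of a 16-bit value.
   Assumes a little-endian host (e.g. x86-64). */
static uint16_t sketch_htons(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Hypothetical stand-in for aton: parse "a.b.c.d" and store the four
   octets in out[0..3], first octet first (i.e. network byte order).
   Returns 0 on success, -1 on malformed input. */
static int sketch_aton(const char *s, uint8_t out[4])
{
    assert(s != NULL && out != NULL);
    for (int i = 0; i < 4; i++) {
        unsigned value = 0;
        int digits = 0;
        while (*s >= '0' && *s <= '9') {
            value = value * 10 + (unsigned)(*s - '0');
            if (value > 255 || ++digits > 3)
                return -1;              /* octet out of range or too long */
            s++;
        }
        if (digits == 0)
            return -1;                  /* empty octet, e.g. "1..2.3" */
        out[i] = (uint8_t)value;
        if (i < 3) {
            if (*s != '.')
                return -1;              /* fewer than four octets */
            s++;
        }
    }
    return *s == '\0' ? 0 : -1;         /* reject trailing garbage */
}
```

With this, sketch_aton("204.22.123.111", ip) fills ip with 204, 22, 123, 111, and sketch_htons(6667) gives 0x0B1A, which a little-endian machine stores in memory as the bytes 0x1A 0x0B. Writing the address as four bytes rather than one integer sidesteps the host byte order question for the IP part.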

Personally though, I feel that there's a beautiful form of horror in the testing code I used to figure out and recreate aton:

Code: Select all

void test_aton(char *name, char *address)
{
    /* ... */
    struct in_addr addr1 = {0};
    inet_aton(address, &addr1);
    printf("s_addr1: %u\n", (unsigned int)addr1.s_addr);
    printf("%d.%d.%d.%d\n", ((uint8_t*)&(addr1.s_addr))[0],
                            ((uint8_t*)&(addr1.s_addr))[1],
                            ((uint8_t*)&(addr1.s_addr))[2],
                            ((uint8_t*)&(addr1.s_addr))[3]);
    /* ... */
}
Essentially at this level the networking interface is a complicated dance between how we think of the structures, and what linux expects in the structures. For example, the general structure is defined in the POSIX documentation as:

Code: Select all

struct sockaddr *
	sa_family_t  sa_family  Address family.
	char         sa_data[]  Socket address (variable-length data).
which in our specific case ends up being:

Code: Select all

struct sockaddr_in *
	sa_family_t     sin_family   AF_INET.
	in_port_t       sin_port     Port number.
	struct in_addr  sin_addr     IP address.
With the usual example code telling us to set the IP address by doing something like:

Code: Select all

	struct sockaddr_in blah = {0};
	/* blah blah blah */
	blah.sin_addr.s_addr = <however you get the network order ip address>
except the kernel actually expects

Code: Select all

struct sockaddr *
	unsigned short   sin_family  AF_INET.
	uint16_t         sin_port    Port number.
	uint32_t         sin_addr    IP Address.
	uint64_t         zero        Literally just 8 bytes of nothing.
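To make that 16-byte layout concrete, here is a rough sketch that fills a raw buffer byte by byte (the way an assembly routine might) and compares it against what the libc struct produces. It assumes Linux, where struct sockaddr_in has no length byte in front; raw_sockaddr_in and matches_libc_layout are hypothetical names, not anything from the project.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: fill a 16-byte buffer field by field, with no
   struct involved. ip holds the four octets, first octet first. */
static void raw_sockaddr_in(uint8_t buf[16], uint16_t port, const uint8_t ip[4])
{
    uint16_t family = AF_INET;

    assert(buf != NULL && ip != NULL);
    memset(buf, 0, 16);               /* bytes 8..15: the zero padding */
    memcpy(buf, &family, 2);          /* bytes 0..1: family, host byte order */
    buf[2] = (uint8_t)(port >> 8);    /* bytes 2..3: port, big endian */
    buf[3] = (uint8_t)(port & 0xff);
    memcpy(buf + 4, ip, 4);           /* bytes 4..7: address, big endian */
}

/* Build the same address through the libc struct and compare the bytes.
   Returns 1 if both representations are identical. */
static int matches_libc_layout(uint16_t port, const uint8_t ip[4])
{
    uint8_t raw[16];
    struct sockaddr_in sa;

    raw_sockaddr_in(raw, port, ip);
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    memcpy(&sa.sin_addr.s_addr, ip, 4);

    return sizeof sa == 16 && memcmp(raw, &sa, 16) == 0;
}
```

On a Linux machine matches_libc_layout(6667, ip) should return 1, which is a cheap way to sanity-check a hand-built structure before passing it to connect.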
But I digress!

After writing a sufficient aton (ASCII-to-network, I think the original name is?) function in C, I wrote a simpler form of it in C (because I haven't coded assembly in over a year, ha) that was structured basically exactly like the assembly version, and then wrote that in assembly. Then I plugged it in, tested it, and it somehow worked?

Then I went and read the IRC spec and realised that it kind of requires domain name -> IP address resolution. After an internal struggle, and reading up on DNS, I decided that, what the hell, I did commit to doing this via only syscalls, and it seems kind of a waste if I don't use this opportunity to learn how exactly DNS messages are resolved (the musl libc code that does this is really disgusting, by the way).

So yay.

Re: [Assembler/C] How I Ended Up Writing A DNS Client and IRC Client In Assembler

Posted: 09 May 2019, 18:49
by Michcioperz
Thanks for writing all that, Alex.

As the formal part goes, I would like to apologize for enforcing this format of writing on thou.

But more importantly, I'm just so glad that my uni curriculum is built in exactly such way that I learnt everything I needed to understand your issues over the course of the last few months - doing syscalls in asm, dealing with implicit expectations of the ABI, writing TCP clients and servers in C.
alexandria wrote:
09 May 2019, 18:22
uint64_t zero Literally just 8 bytes of nothing.
Does that have to be literally zeros or just correct space within the process's virtual memory?
alexandria wrote:
09 May 2019, 18:22
how exactly DNS messages are resolved (the Musl LibC code that does this, is really disgusting, by the way).
I imagine you were as surprised as me and Wolf when you found out about backreferences in DNS messages :D

Re: [Assembler/C] How I Ended Up Writing A DNS Client and IRC Client In Assembler

Posted: 09 May 2019, 19:07
by alexandria
Michcioperz wrote:
09 May 2019, 18:49
As the formal part goes, I would like to apologize for enforcing this format of writing on thou.
Honestly? No worries! It's worth it to try and change my mental impression of these spaces.
Michcioperz wrote:
09 May 2019, 18:49
But more importantly, I'm just so glad that my uni curriculum is built in exactly such way that I learnt everything I needed to understand your issues
Treasure that... I know some people who would kill for a uni curriculum that teaches them actual computer science, haha.
Michcioperz wrote:
09 May 2019, 18:49
Does that have to be literally zeros or just correct space within the process's virtual memory?
Honestly I just realised that the code version doesn't pad out the bytes of that structure :shock:, so it seems linux will accept it regardless and ignore the contents... :roll:
Michcioperz wrote:
09 May 2019, 18:49
I imagine you were as surprised as me and Wolf when you found out about backreferences in DNS messages :D
Honestly... my opinion on that has cycled from:
  • "this is actually pretty ingenious"
  • "Oh shit this will make parsing a NIGHTMARE"
  • "Oh actually it shouldn't be too bad actually"
  • "nO waIT I SPOKE TOO SOON"
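For context on why the backreferences swing between "ingenious" and "nightmare": in a DNS message a name is a run of length-prefixed labels, but a length byte with its top two bits set is instead the first half of a 14-bit offset, counted from the start of the message, where the rest of the name continues. A rough C sketch of a decoder follows. The function name dns_read_name and the cap of 16 jumps are assumptions for illustration, not something taken from the project or required by the spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode the (possibly compressed) name that starts
   at msg[pos] into out as dotted text. Returns how many bytes the name
   occupies at its original position, or -1 on malformed input. */
static int dns_read_name(const uint8_t *msg, size_t msglen, size_t pos,
                         char *out, size_t outlen)
{
    size_t start = pos;
    size_t o = 0;          /* write index into out */
    int consumed = -1;     /* fixed once the first pointer is followed */
    int jumps = 0;

    assert(msg != NULL && out != NULL);
    for (;;) {
        if (pos >= msglen)
            return -1;
        uint8_t len = msg[pos];
        if ((len & 0xC0) == 0xC0) {            /* backreference */
            if (pos + 1 >= msglen || ++jumps > 16)
                return -1;                     /* truncated or looping */
            if (consumed < 0)
                consumed = (int)(pos + 2 - start);
            pos = ((size_t)(len & 0x3F) << 8) | msg[pos + 1];
            continue;
        }
        if (len & 0xC0)
            return -1;                         /* reserved label types */
        if (len == 0) {                        /* root label ends the name */
            if (consumed < 0)
                consumed = (int)(pos + 1 - start);
            break;
        }
        /* room for an optional '.', the label, and the final NUL */
        if (pos + 1 + len > msglen || o + len + 2 > outlen)
            return -1;
        if (o > 0)
            out[o++] = '.';
        memcpy(out + o, msg + pos + 1, len);
        o += len;
        pos += 1 + (size_t)len;
    }
    if (o >= outlen)
        return -1;
    out[o] = '\0';
    return consumed;
}
```

The awkward part is visible in the bookkeeping: the decoded text can come from anywhere in the message, while the caller only wants to skip the bytes at the original position, and a hostile reply can point a backreference at itself, hence the jump limit.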

Re: [Assembler/C] How I Ended Up Writing A DNS Client and IRC Client In Assembler

Posted: 09 May 2019, 19:15
by Michcioperz
alexandria wrote:
09 May 2019, 19:07
Treasure that... I know some people who would kill for a uni curriculum that teaches them actual computer science, haha.
I think it is well deserved, I sacrificed many freshman year friends to Maths to get to where I am now.
alexandria wrote:
09 May 2019, 19:07
Honestly I just realised that the code version doesn't pad out the bytes of that structure :shock:, so it seems linux will accept it regardless and ignore the contents... :roll:
I mean, I thought that's what the addrlen argument in rdx is for. Not an expert ofc, I haven't tried to mix up my asm and networking skills.

Re: [Assembler/C] How I Ended Up Writing A DNS Client and IRC Client In Assembler

Posted: 09 May 2019, 19:54
by Wolf480pl
Sounds like a long way to go.
IIRC a DNS client, especially an async one, is hard to write even in C, let alone asm.

Btw. do you need the DNS for anything other than resolving the server you're going to connect to?
On the server side you'd obviously need to look up clients' reverse DNS and all that, but I can't remember anything requiring DNS on the client side...

Re: [Assembler/C] How I Ended Up Writing A DNS Client and IRC Client In Assembler

Posted: 09 May 2019, 20:07
by Michcioperz
If I understood, this is just to resolve the server name initially.

Re: [Assembler/C] How I Ended Up Writing A DNS Client and IRC Client In Assembler

Posted: 10 May 2019, 01:56
by alexandria
Wolf480pl wrote:
09 May 2019, 19:54
Sounds like a long way to go.
IIRC a DNS client, especially an async one, is hard to write even in C, let alone asm.
It looks marginally 'easy'? But there are certain assumptions I can make that others might not be able to :D
For example, as far as I can tell I only have to deal with parsing A and CNAME entries, rather than bothering with the full gamut.
Wolf480pl wrote:
09 May 2019, 19:54
Btw. do you need the DNS for anything other than resolving the server you're going to connect to?
On the server side you'd obviously need to look up clients' reverse DNS and all that, but I can't remember anything requiring DNS on the client side...
Michcioperz wrote:
09 May 2019, 20:07
If I understood, this is just to resolve the server name initially.
Yeah! So, we need to implement DNS Query messages so we can convert "foo.xyz" to "204.22.123.111". I'm still undecided on whether I really need to write the comparison function (to check if the returned domain is the one we asked for) :D
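As a rough illustration of what such a query message looks like on the wire, here is a minimal C sketch that builds one question for the A record of a hostname: a 12-byte header, the name as length-prefixed labels, then type and class. The function name dns_build_a_query is hypothetical, and setting the "recursion desired" flag is an assumption about how the resolver would be used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write a DNS query for the A record of host into
   buf. id is any 16-bit value, used to match the reply to the query.
   Returns the message length, or -1 if the name is malformed or the
   buffer is too small. */
static int dns_build_a_query(uint8_t *buf, size_t buflen,
                             const char *host, uint16_t id)
{
    assert(buf != NULL && host != NULL);
    size_t hostlen = strlen(host);
    /* header (12) + encoded name (at most hostlen + 2) + type and class (4) */
    if (hostlen == 0 || hostlen > 253 || buflen < 12 + hostlen + 2 + 4)
        return -1;

    memset(buf, 0, 12);
    buf[0] = (uint8_t)(id >> 8);      /* ID, big endian */
    buf[1] = (uint8_t)(id & 0xff);
    buf[2] = 0x01;                    /* flags: RD (recursion desired) */
    buf[5] = 1;                       /* QDCOUNT = 1 */

    size_t pos = 12;
    const char *label = host;
    while (*label) {
        const char *dot = strchr(label, '.');
        size_t len = dot ? (size_t)(dot - label) : strlen(label);
        if (len == 0 || len > 63)
            return -1;                /* empty or oversized label */
        buf[pos++] = (uint8_t)len;    /* length prefix */
        memcpy(buf + pos, label, len);
        pos += len;
        label += len;
        if (*label == '.')
            label++;                  /* skip the separator */
    }
    buf[pos++] = 0;                   /* root label terminates the name */
    buf[pos++] = 0; buf[pos++] = 1;   /* QTYPE  = 1 (A)  */
    buf[pos++] = 0; buf[pos++] = 1;   /* QCLASS = 1 (IN) */
    return (int)pos;
}
```

For "foo.xyz" this yields a 25-byte message whose question section starts with 3 'f' 'o' 'o' 3 'x' 'y' 'z' 0. The reply echoes that question back, so one cheap form of the comparison mentioned above would be a byte compare of the echoed question against what was sent, on the assumption that the server does not change the case of the name.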

Re: [Assembler/C] How I Ended Up Writing A DNS Client and IRC Client In Assembler

Posted: 10 May 2019, 10:03
by Wolf480pl
Well, if I was writing an IRC client in asm, I'd just make the user provide the server's IP address instead of hostname, at least initially. But that's just me being lazy.

Re: [Assembler/C] How I Ended Up Writing A DNS Client and IRC Client In Assembler

Posted: 11 May 2019, 09:38
by alexandria
Wolf480pl wrote:
10 May 2019, 10:03
Well, if I was writing an IRC client in asm, I'd just make the user provide the server's IP address instead of hostname, at least initially. But that's just me being lazy.
Yeah, that was originally the plan? But IIRC I noticed that the RFC makes domain names necessary? In retrospect it might have been the server RFC, hmm.

Re: [Assembler/C] How I Ended Up Writing A DNS Client and IRC Client In Assembler

Posted: 11 May 2019, 09:40
by Michcioperz
alexandria wrote:
11 May 2019, 09:38
Yeah, that was originally the plan? But IIRC I noticed that the RFC makes domain names necessary? In retrospect it might have been the server RFC, hmm.
Nobody respects the RFCs anyway…