Louis' imperfect blog

How I do (type-safe) container types in C – type-safe(r) container types

How I do (type-safe) container types in C

Recently, after seeing two articles on how to achieve container-types in C, I decided I'd also write one.

Daniel Hooper's article | lobste.rs thread

Why am I not satisfied with these two articles

The only correct reason is that I suffer from the not-invented-here syndrome, but I have some complaints that would make me not want to use these implementations.

Uecker's way

#define vec(T) struct vec_##T { ssize_t N; T data[/* .N */]; }
-- Martin Uecker, Generic Containers in C: vec

This is how I started doing "generics" in C and I quickly ran into issues with having the macro define the name of the Vec.

For "simple types" this works great, vec(int) would expand to:

struct vec_int { ssize_t N; int data[]; }

But for more complex types this would break down pretty quickly.

struct MyValue { int a; int b; };

vec(struct MyValue)

Would expand to:

struct vec_struct MyValue { ssize_t N; struct MyValue data[]; }

And it would result in invalid C. This can be worked around by typedef-ing the struct instead, but it would force me to typedef pointers to values as well, and I don't like how it "pollutes" my namespace. I'm also not that imaginative, so if I can avoid naming something I usually go that route.

Overall it's not a bad way of doing things, but my real gripe comes with the way that the logic is implemented.

#define vec_push(T, v, x)                                  \
   ({                                                      \
       vec(T) **_vp = (v);                                 \
       ssize_t _N = (*_vp)->N + 1;                         \
       ssize_t _S = _N * (ssize_t)sizeof((*_vp)->data[0])  \
               + (ssize_t)sizeof(vec(T));                  \
       if (!(*_vp = realloc(*_vp, _S))) abort();           \
       (*_vp)->N++;                                        \
       (*_vp)->data[_N - 1] = (x);                         \
   })

-- Martin Uecker, Generic Containers in C: vec

Having done that in the past, I now tend to avoid including too much logic inside my macros because I find that they lead to cryptic error messages and sometimes variable name clashes. (Again, I suck at naming things.) I've spent too much time grep-ing through the output of cpp, so I've now switched to doing something else.

Hooper's way

I find that Hooper does container types in a very similar way to myself.

#define List(type) union { \
    ListNode *head; \
    type *payload; \
}
-- Daniel Hooper, Type Safe Generic Data Structures in C

Defining an unnamed union avoids the complex type problem we ran into with the other implementation, but as Hooper points out, without doing anything else, this would result in type errors when expanding the macro more than once.

From Hooper's article:

List(Foo) a;
List(Foo) b = a; // error

void my_function(List(Foo) list);
my_function(a); // error: incompatible type
Even though the variables have identical type definitions, the compiler still errors because they are two distinct definitions. A typedef avoids the issue:
typedef List(Foo) ListFoo; // this makes it all work

ListFoo a;
ListFoo b = a; // ok

void my_function(ListFoo list);
my_function(a); // ok

List(Foo) local_foo_list; // still works 
-- Daniel Hooper, Type Safe Generic Data Structures in C

I personally don't like how expansions of the same macro won't point back to the same type. C23 "fixed" this behaviour with its named record equivalence rule, but for it to work we would need to make the name of the type part of the macro, and we would run into the same issue with complex types.

My way

I do it much in the same way as Hooper. I declare a "base implementation" of my datastructure that every generic version will wrap.

// aligned on eights (or fours on 32 bit machines)
struct Vec {
    size_t len;
    size_t cap;
    void *data;
};

And a macro to define a new type of that datastructure.

#define VecDef(_type) \
    typedef struct { \
        struct Vec inner; \
        _type *phantom[0]; \
    }

Inserting the typedef directly in the macro allows me to define this type and to export it rather than re-expanding the same macro every time. The main drawback of this approach is that typedefs cannot be forward declared but structs can. Take note that the phantom field is a zero sized array of pointers to _type, this way I can forward declare _type. Also as a bonus, the zero size array is a zero-sized type (duh) and, in this case, adds no additional padding.

VecDef(int) IntVec;
VecDef(struct Pos) PosVec;

To then get some type safety, I make use of C11's _Generic keyword.

#define vecPush(vec, data) _Generic((data), typeof(**((vec)->phantom)): \
    vec_push(&(vec)->inner, sizeof(**((vec)->phantom)), &(data)))

By using _Generic here I'm able to check that the type passed in matches exactly the type expected.

VecDef(int) IntVec;
IntVec a = {};

char b = 10;
// Controlling expression type 'char' not compatible
// with any generic association type
vecPush(&a, b);

Hooper's way of type checking might be superior since the compiler will tell you which type it expected instead of just saying that the type is incompatible.

He uses the ternary operator to assert that both the parameter and the inner type match.

1 ? (param) : *(vec)->type

Sadly, when it comes to reading a value out, I haven't found a way to have as much control. I instead cast the pointer, which works great for pointer types, but I open myself up to C's implicit conversion rules if I dereference that pointer.

#define vecGetPtr(vec, idx) ((typeof(*(vec)->phantom))vec_get_ptr(\
    &(vec)->inner, sizeof(**((vec)->phantom)), idx))

IntVec a = {};
// incompatible pointer types
double *r1 = vecGetPtr(&a, 1);
// C silent type casting, meh
double r2 = *vecGetPtr(&a, 1);

This dovetails pretty nicely with C23's auto keyword, where I basically never have to worry about type mismatches.

// r3 will always be the correct type
auto r3 = *vecGetPtr(&a, 1);

I've found that this technique works pretty well and I've been able to build all the reusable data structures I've needed with it:

A Hashmap

#define HMapDef(type) \
    typedef struct { \
        struct HMap inner; \
        type *phantom[0]; \
    }

A Queue

#define QueueDef(type) \
    typedef struct { \
        struct Queue inner; \
        type *phantom[0]; \
    }

And many more.

The only generic data structure I use that is not written in this way is my implementation of a priority queue. I'm planning to rewrite it this way in order to make it type-safe; I just haven't taken the time to do it yet.

github gist with code examples used in this article

I wrote a bug and It made me reflect on OOP – It's not an OOP bashing post, surprisingly

I wrote a bug and It made me reflect on OOP

The feature

I wrote a piece of code that would search for matches in a text block and try to correlate text position with line numbers.

let re = Regex::new("substr").unwrap();
// impl Iterator<Item = usize>
let source = re.find_iter(text_block).map(|m| m.start());

let line_ends = [10, 14, 30];

for p in source {
    let line_idx = line_ends.upper_bound(&p);
    ...
}

The implementation was made to be generic and accept any iterator of usize as a source. This made testing easier, and I didn't have to spell out the whole type of re.find_iter(text_block).map(|m| m.start()), which is quite wordy.
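A minimal sketch of what "generic over any iterator of usize" can look like. The function name line_indices is hypothetical, and the standard library's partition_point stands in for upper_bound, which is not a method on standard slices.

```rust
/// Hypothetical helper: maps each match position to a line index.
/// `partition_point` returns the index of the first line end that is
/// greater than `p`, which is what an upper-bound search gives.
fn line_indices(positions: impl Iterator<Item = usize>, line_ends: &[usize]) -> Vec<usize> {
    positions
        .map(|p| line_ends.partition_point(|&end| end <= p))
        .collect()
}

fn main() {
    let line_ends = [10, 14, 30];
    // In a test, a plain Vec can stand in for the regex iterator.
    let lines = line_indices(vec![1, 12, 20].into_iter(), &line_ends);
    println!("{:?}", lines);
}
```

Because the source is only constrained by its item type, the caller never has to name the concrete iterator type.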

A performance improvement

A few days after writing this patch, I found myself profiling the system, and this part of the system turned out to be a bit slow. You might have already noticed, but the regex always returns positions in sorted order: 1, 3, 6, 67.... This made it pretty easy to ignore parts of the line-end array that had already been searched.

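let re = Regex::new("substr").unwrap();
// impl Iterator<Item = usize>
let source = re.find_iter(text_block).map(|m| m.start());

let line_ends = [10, 14, 30];

let mut off = 0;

for p in source {
    // only search the part of `line_ends` that has not been passed yet;
    // the result is relative to the sub-slice, so add `off` back
    let line_idx = off + line_ends[off..].upper_bound(&p);
    off = line_idx;
    ...
}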

The bug

Requirements changed, along with some code, and a new way to search for text was added. It was still an iterator of usize and mostly returned positions in increasing order, so it appeared to work great. It was faster than the previous regex method, but it would sometimes return unsorted results. Sadly, none of the test cases caught that behaviour, so we ended up missing some line indices from the returned value.

let source = SuffixFinder::new("ends with this", text_block);

let line_ends = [10, 14, 30];

let mut off = 0;

for p in source {
    // `p` could sometimes be lower than the position
    // present at `line_ends[off]` because the
    // positions are not returned in sorted order
    let line_idx = off + line_ends[off..].upper_bound(&p);
    off = line_idx;
    ...
}

The fix was pretty simple but I kept thinking about this bug.
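The post does not show the fix, so the sketch below only reproduces the failure and shows one possible guard, which is not necessarily the fix that was applied. All function names are hypothetical, and partition_point from the standard library plays the role of upper_bound.

```rust
/// Index of the first line end strictly greater than `p`.
fn upper_bound(line_ends: &[usize], p: usize) -> usize {
    line_ends.partition_point(|&end| end <= p)
}

/// The optimised lookup: only correct when `positions` is ascending.
fn lines_assuming_sorted(positions: &[usize], line_ends: &[usize]) -> Vec<usize> {
    let mut off = 0;
    positions
        .iter()
        .map(|&p| {
            off += upper_bound(&line_ends[off..], p);
            off
        })
        .collect()
}

/// One possible guard: restart from the beginning whenever a position
/// goes backwards, and keep the fast path otherwise.
fn lines_guarded(positions: &[usize], line_ends: &[usize]) -> Vec<usize> {
    let mut off = 0;
    let mut prev = 0;
    positions
        .iter()
        .map(|&p| {
            if p < prev {
                off = 0;
            }
            prev = p;
            off += upper_bound(&line_ends[off..], p);
            off
        })
        .collect()
}

fn main() {
    let line_ends = [10, 14, 30];
    let unsorted = [20, 3, 12];
    println!("assuming sorted: {:?}", lines_assuming_sorted(&unsorted, &line_ends));
    println!("guarded:         {:?}", lines_guarded(&unsorted, &line_ends));
}
```

With unsorted input, the first function never looks left of `off` again, so every later position is reported on a line that is too far down.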

Why is OOP relevant to this discussion

When I was learning programming, a huge part of the curriculum was dedicated to so-called "object-oriented design". If I were to model the previous problem in terms of object inheritance and interfaces, it would look like this.

               _____________________________________________
              |              Interface Searcher             |
              | * fn find_line(line_ends: [usize]) -> usize |
              |___~_line_ends.upper_bound(self.p)___________|
 ___________________//________________________     ____\\______
|           Interface SortedSearcher          |   |SuffixFinder|
| * fn last_pos() -> usize                    |
| * fn find_line(line_ends: [usize]) -> usize |
|   ~ line_ends[self.last_pos()..]            |
|___~__________.upper_bound(self.p)___________|
           _____//____
          |RegexFinder|

fn find_position_line(s: Interface Searcher, line_ends: [usize]) -> [usize]
~   let lines = Vec::new();
~   loop
~       let p = s.find_line(line_ends);
~       if p == usize::MAX
~           return lines
~       lines.push(p)

The optimisation would be implemented using specialisations and only trigger when an object inherits from SortedSearcher. I would probably have had to write a newtype that would have wrapped the regex type since it comes from a library.
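Rust has no class inheritance, so the closest analogue to the diagram is a trait with a default method that the sorted implementation overrides. This is only a sketch: UnsortedFinder and SortedFinder are hypothetical stand-ins for SuffixFinder and the regex wrapper, and partition_point replaces upper_bound.

```rust
use std::vec::IntoIter;

/// Hypothetical base interface: any source of match positions.
trait Searcher {
    /// Next match position, or `None` once the source is exhausted.
    fn next_pos(&mut self) -> Option<usize>;

    /// Default logic: binary-search the whole `line_ends` slice each time.
    fn find_line(&mut self, line_ends: &[usize]) -> Option<usize> {
        let p = self.next_pos()?;
        Some(line_ends.partition_point(|&end| end <= p))
    }
}

/// Stand-in for a finder that may yield positions in any order.
struct UnsortedFinder {
    positions: IntoIter<usize>,
}

impl Searcher for UnsortedFinder {
    fn next_pos(&mut self) -> Option<usize> {
        self.positions.next()
    }
}

/// Stand-in for the regex wrapper: positions are ascending, so it
/// overrides `find_line` and skips line ends it has already passed.
struct SortedFinder {
    positions: IntoIter<usize>,
    off: usize,
}

impl Searcher for SortedFinder {
    fn next_pos(&mut self) -> Option<usize> {
        self.positions.next()
    }

    fn find_line(&mut self, line_ends: &[usize]) -> Option<usize> {
        let p = self.next_pos()?;
        self.off += line_ends[self.off..].partition_point(|&end| end <= p);
        Some(self.off)
    }
}

/// The caller only knows about `Searcher`; the optimisation is picked
/// by whichever implementation it is handed.
fn find_position_lines(s: &mut dyn Searcher, line_ends: &[usize]) -> Vec<usize> {
    let mut lines = Vec::new();
    while let Some(line) = s.find_line(line_ends) {
        lines.push(line);
    }
    lines
}

fn main() {
    let line_ends = [10, 14, 30];
    let mut sorted = SortedFinder { positions: vec![1, 3, 12, 20].into_iter(), off: 0 };
    let mut unsorted = UnsortedFinder { positions: vec![20, 3, 12].into_iter() };
    println!("{:?}", find_position_lines(&mut sorted, &line_ends));
    println!("{:?}", find_position_lines(&mut unsorted, &line_ends));
}
```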

Generally, I dislike this type of modelling: it tends to result in poor data locality due to a lot of objects having to be allocated in languages like Java. And, in my opinion, it makes the code harder to think about. The find_position_line function now depends on two abstract classes implementing parts of its logic, so it requires a lot of jumping around in the code, and every time logic is updated in the Searcher, SortedSearcher's implementation needs to be checked and/or updated.

But had I taken the time to model multiple levels of interfaces and written the logic inside those interfaces (dependency injection), I would not have written that bug, and that bugs me.

You don't have to boot from just 512 bytes – As long as you boot from a CD

You don't have to boot from just 512 bytes

Wait, what?

Conventional wisdom says that you can only boot from the first sector of a floppy (512 bytes) or something that looks and behaves like the first sector of a floppy. But it doesn't have to be the case as long as you boot from a CD.

The "normal" booting process

Historically the IBM PC did not ship with a hard drive; it had a BASIC interpreter in its ROM and up to 2 floppy disk drives. If you wanted a proper operating system, the PC had to boot from a floppy containing an OS. The BIOS looked for the magic numbers [0x55, 0xAA] at the end of each floppy's first sector to detect if it could boot from it. Once a bootable drive was found, the sector was loaded into memory at address 0x7C00, and the CPU started executing at that address. When hard drives came along, a similar technique was used to boot from the MBR, but only 446 bytes were available¹ compared to the floppies' 510.
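The signature check described above is small enough to sketch. The function name has_boot_signature is hypothetical; it only mirrors the "last two bytes are 0x55, 0xAA" rule and nothing else a real BIOS might do.

```rust
/// Size of a floppy (or MBR) boot sector in bytes.
const SECTOR_SIZE: usize = 512;

/// Returns true when `sector` is exactly one boot sector long and ends
/// with the magic bytes 0x55, 0xAA at offsets 510 and 511.
fn has_boot_signature(sector: &[u8]) -> bool {
    sector.len() == SECTOR_SIZE && sector[510] == 0x55 && sector[511] == 0xAA
}

fn main() {
    let mut sector = [0u8; SECTOR_SIZE];
    println!("blank sector bootable:  {}", has_boot_signature(&sector));
    sector[510] = 0x55;
    sector[511] = 0xAA;
    println!("signed sector bootable: {}", has_boot_signature(&sector));
}
```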

Unsurprisingly, most modern PCs still support this booting mechanism. However, some manufacturers have started to remove support for legacy BIOS booting in favour of UEFI, but that's a story/rant for another time.

A Minimal Bootable ISO

A tiny bit of context

An ISO file is just a file containing an ISO 9660 file system, which is the file system that CDs use. PCs boot off CDs via the El Torito² extension to the ISO 9660 standard.

The format is pretty straightforward:

An ISO 9660 with EL TORITO extension
  (the bits to boot a PC at least)
  Offset
  0x0000_ _____________
         |    ....     |
         |   unused    |
         |    ....     |
  0x8000_|_____________|
  0x8800_|_primary_vol_|
         |_boot_record_| --.
         |    ....     |    |
         |             |    |  addr
         |    ....     |    | of boot
         |_____________|    | catalog
         |__terminator_|    |
     .-- |_boot_catalog| <-´
     `-> |__boot_image_|
         |    ....     |
         |             |
         |    ....     |
          ¯¯¯¯¯¯¯¯¯¯¯¯¯

The first 0x8000 bytes are unused, go wild and use them however you want. These bytes were left unused by the specification to allow for other booting systems to work on CDs. When "burning" an ISO to a thumb drive, this section generally contains an MBR or the UEFI equivalent.
The CD is divided into fixed-size sectors, in most cases 2048 bytes each.
The first sector used is at offset 0x8000 and is called the Primary Volume Descriptor.
The second sector, at offset 0x8800, may be a Boot Record.
It is not required to read the CD's filesystem to boot from a CD.

That last point piqued my interest. If you only care about finding something that looks like a floppy and booting it, you can ignore most of the filesystem. I wanted to see just how little of the spec I had to implement to build a Minimal Bootable ISO.

El Torito basics

El Torito defines two structures on the CD: the Boot Catalog, which is made up of entries describing one or more bootable payloads, and the Boot Record Volume Descriptor, which the BIOS uses to find the boot catalog.

Boot Record Volume Descriptor

          Boot Record Volume Descriptor
 _______________________________________________
|Offset_|__type___|____________Desc_____________|
|_0x000_|___u8____|__boot_record_indicator_=_0__|
| 0x001 |         |                             |
|  ...  | [u8; 5] | ISO-9660 identifier ="CD001"|
|_0x005_|_________|_____________________________|
|_0x006_|___u8____|_________version_=_1_________|
| 0x007 |         |   Boot system identifier    |
|  ...  | [u8;32] | ="EL TORITO SPECIFICATION"  |
|_0x026_|_________|_____________________________|
| 0x027 |         |                             |
|  ...  | [u8;32] |     Unused, "must" be 0     |
|_0x046_|_________|_____________________________|
| 0x047 |         |Sector id of the boot catalog|
|  ...  |   u32   |   sec_id * 2048 = offset    |
|_0x04a_|_________|_____________________________|
| 0x04b |         |                             |
|  ...  |[u8;1973]|     Unused, "must" be 0     |
| 0x7ff |         |                             |
 ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
    * All multi byte numbers are in little endian

The only value that changes is the sector id of the boot catalog (bytes 0x47 to 0x4a). Everything else is either zeroed or a magic value of some sort.
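A minimal sketch of filling in that descriptor by hand, following the offsets in the table above. The function name boot_record is hypothetical and is not the API of the iso9660 repository mentioned later.

```rust
/// Size of one CD sector in bytes.
const SECTOR_SIZE: usize = 2048;

/// Builds a Boot Record Volume Descriptor that points at the boot
/// catalog stored in sector `catalog_sector`.
fn boot_record(catalog_sector: u32) -> [u8; SECTOR_SIZE] {
    let mut sector = [0u8; SECTOR_SIZE];
    sector[0x00] = 0; // boot record indicator
    sector[0x01..0x06].copy_from_slice(b"CD001"); // ISO 9660 identifier
    sector[0x06] = 1; // version
    // Boot system identifier, zero-padded to 32 bytes by the initial fill.
    let id = b"EL TORITO SPECIFICATION";
    sector[0x07..0x07 + id.len()].copy_from_slice(id);
    // Sector id of the boot catalog, little endian.
    sector[0x47..0x4b].copy_from_slice(&catalog_sector.to_le_bytes());
    sector
}

fn main() {
    let sector = boot_record(18);
    println!("identifier: {:?}", std::str::from_utf8(&sector[0x01..0x06]));
    println!("catalog pointer bytes: {:02x?}", &sector[0x47..0x4b]);
}
```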

The Boot Catalog

The boot catalog defines where the boot payload(s) are located. It is stored across one or more sectors and is composed of a series of 32-byte entries.

                    The Boot Catalog
bytes 0x00 ........ 0x1f
 0x00 [Validation Entry] <- makes sure the data is not corrupted
 0x20 [  Initial Entry ] <- contains info about a boot payload
 0x40 [ Section Header ] <- info about section entries (optional)
 0x60 [ Section Entry 1] <- info about a boot image 1 (optional)
 0x80 [  Entry Ext 1   ] <- 13 bytes of data* (optional)
  --  |       :        |
 0x?? [ Section Entry N]
 0x?? [  Entry Ext N   ]

    * Multiple `Entry Ext` can be chained together.

For an ISO containing only one boot payload, we only need to consider the Validation Entry and the Initial Entry.

The Validation Entry

The validation entry is used to detect if the content is corrupted.

              Validation Entry
 ______________________________________________
|Offset|__type___|____________Desc_____________|
|_0x00_|___u8____|________header_id_=_1________|
|_0x01_|___u8____|____platform_id_=(0|1|2)_____|
| 0x02 |   u16   |     Unused, "must" be 0     |
|_0x03_|_________|_____________________________|
| 0x04 |         |                             |
|  ..  | [u8;24] |      manufacturer id        |
|_0x1b_|_________|_____________________________|
| 0x1c |   u16   |      checksum reserved      |
|_0x1d_|_________|_____________________________|
|_0x1e_|___u8____|____________0x55_____________|
| 0x1f |   u8    |            0xaa             |
 ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

Platform id is interesting because it was originally defined as:

#[repr(u8)]
enum PlatformId {
    x86     = 0x0,
    PowerPC = 0x1,
    Mac     = 0x2,
}

But the Mac platform (pre-Intel, in this case) never implemented booting off a CD using El Torito. Although not in the standard, 0xef is commonly used to identify bootable images that rely on UEFI.

This is the enum I ended up using:

#[repr(u8)]
pub enum Platform {
    X86 = 0,
    PPC = 1,
    Mac = 2, // mac is never used ?
    UEFI = 0xef, // not part of the spec..
}

The other noteworthy field is checksum reserved. A checksum is computed by summing up the whole entry as a list of little-endian u16 values. This reserved u16 is set so that the sum wraps around to zero.
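A sketch of how the reserved word can be computed, assuming the checksum covers the 32 bytes of the validation entry (which is how I read the El Torito specification). The names word_sum and validation_entry are hypothetical.

```rust
/// Sums a 32-byte catalog entry as sixteen little-endian u16 words,
/// wrapping on overflow.
fn word_sum(entry: &[u8; 32]) -> u16 {
    entry
        .chunks_exact(2)
        .map(|w| u16::from_le_bytes([w[0], w[1]]))
        .fold(0u16, |acc, w| acc.wrapping_add(w))
}

/// Builds a validation entry with an empty manufacturer id and fills
/// in the checksum word so that the entry sums to zero.
fn validation_entry(platform_id: u8) -> [u8; 32] {
    let mut entry = [0u8; 32];
    entry[0x00] = 1; // header id
    entry[0x01] = platform_id;
    entry[0x1e] = 0x55;
    entry[0x1f] = 0xaa;
    // The checksum field is still zero here, so the sum covers everything else.
    let checksum = 0u16.wrapping_sub(word_sum(&entry));
    entry[0x1c..0x1e].copy_from_slice(&checksum.to_le_bytes());
    entry
}

fn main() {
    let entry = validation_entry(0); // 0 = x86
    println!("entry: {:02x?}", entry);
    println!("word sum: {}", word_sum(&entry));
}
```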

The Initial Entry

The second entry in the catalog is the initial entry; it contains info on a segment containing a bare metal 16-bit "real mode" executable and how to load it into memory.

                Initial Entry
 ______________________________________________
|Offset|__type___|____________Desc_____________|
|_0x00_|___u8____|_boot_indicator_=(0x88|0x00)_|
|_0x01_|___u8____|___boot_media_type_=(0..=4)__|
| 0x02 |   u16   |      Load Segment addr      |
|_0x03_|_________|_____________________________|
|_0x04_|___u8____|_________system_type_________|
|_0x05_|___u8____|_____Unused_"must"_be_0______|
| 0x06 |   u16   |       Sector Count          |
|_0x07_|_________|_____________________________|
| 0x08 |         |    Block address of the     |
|  ..  |   u32   |         bootloader          |
|_0x0b_|_________|_____________________________|
| 0x0c |         |                             |
|  ..  | [u8;20] |     Unused "must" be 0      |
| 0x1f |         |                             |
 ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

A boot indicator of 0x88 marks the entry as bootable (0x00 marks it as not bootable); in practice, it is almost always set to 0x88. The boot media type lets the BIOS expose this sector to the executable as if it were a floppy, a hard drive or a CD. This lets older operating systems like DOS boot and read data from a CD as if it were a floppy without needing any extra drivers.

Sector count

Sector count tells the BIOS how many sectors of the emulated device it should load into memory. This lets you load more than one floppy sector. In CD mode, this would let you load up to 128MB of data into memory, crushing the meager 510B (if that) that level 1 bootloaders need to restrict themselves to.
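The initial entry can be serialised the same way as the other structures. This is a sketch following the table above; the function name initial_entry and its parameter names are hypothetical, and the media type value 2 for a 1.44 MB floppy is my reading of the El Torito specification rather than something stated in this article.

```rust
/// Serialises an initial entry into its 32-byte on-disk form.
fn initial_entry(
    bootable: bool,
    media_type: u8,
    load_segment: u16,
    system_type: u8,
    sector_count: u16,
    image_sector: u32,
) -> [u8; 32] {
    let mut entry = [0u8; 32];
    entry[0x00] = if bootable { 0x88 } else { 0x00 }; // boot indicator
    entry[0x01] = media_type;
    entry[0x02..0x04].copy_from_slice(&load_segment.to_le_bytes());
    entry[0x04] = system_type;
    // 0x05 is unused and stays zero.
    entry[0x06..0x08].copy_from_slice(&sector_count.to_le_bytes());
    entry[0x08..0x0c].copy_from_slice(&image_sector.to_le_bytes());
    // 0x0c..=0x1f is unused and stays zero.
    entry
}

fn main() {
    // Bootable, 1.44 MB floppy emulation (assumed to be media type 2),
    // default load segment, 4 emulated sectors, image in CD sector 19.
    let entry = initial_entry(true, 2, 0, 0, 4, 19);
    println!("{:02x?}", entry);
}
```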

Booting a payload larger than 512 bytes

I uploaded the code I used to build my Minimal Bootable ISO to GitHub under https://github.com/lorlouis/iso9660. Calling make will create a disk image and run it via QEMU. src/bin/bootable.rs contains the steps to create the ISO. The steps loosely resemble:

Create a primary header

let primary_header = VD {
    ty: VDType::PrimaryVD,
    version: 1,
};

This is needed as SeaBIOS, the default i386 BIOS implementation in QEMU, checks to see if what's in the CD drive really is a CD.

Create a boot record of the El Torito variety

let boot_record = BootRecord::el_torito(18);

18 here denotes sector 18, where the boot catalog will be placed. Counting from zero, sectors 0 to 15 are unused, the primary volume descriptor sits in sector 16 and the boot record in sector 17, which leaves sector 18 free.

Create a validation entry

let validation = ValidationEntry {
    header_id: 1,
    platform_id: Platform::X86,
    manufacturer_id: None,
};

SeaBIOS does not check the sector's checksum so the checksum reserved field is filled with 0s.

Create the initial entry

let initial = InitialEntry {
    boot_indicator: BootIndicator::Bootable,
    boot_media: BootMedia::Floppy1_44,
    load_segment: 0, // ie default value (I know it should be an option)
    sys_type: 0,  // no idea what it's supposed to be, idk it felt right
    sector_count: 4, // hmm interesting
    virtual_disk_addr: 19, // the last segment
};

sector_count is set to 4 because of the boot media emulation: floppy sectors are 512 bytes long and a CD sector is 2048 bytes long, so four emulated sectors cover exactly one CD sector. It is possible to load more than that, but I did not see any need for this proof of concept.

The last step is to concatenate the files into an ISO

# create the 20 sectors required
dd if=/dev/zero of=$(ISO_FILE) count=20 bs=2048
# copy the iso data into sectors 16 to 18
dd if=$(ISO_DATA) of=$(ISO_FILE) seek=16 count=3 bs=2048 conv=notrunc
# copy stage 1
dd if=$(STAGE1_BIN) of=$(ISO_FILE) seek=$((19*4)) count=4 bs=512 conv=notrunc
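For readers who prefer to see the layout as code, the three dd commands can be approximated in memory as below. This is only a sketch: assemble_iso is a hypothetical name, and unlike dd it refuses oversized inputs instead of truncating them.

```rust
/// Size of one CD sector in bytes.
const CD_SECTOR: usize = 2048;

/// Zero-fills 20 sectors, copies the volume descriptors and boot catalog
/// to sectors 16..=18, and copies the stage 1 binary to sector 19.
fn assemble_iso(iso_data: &[u8], stage1: &[u8]) -> Vec<u8> {
    assert!(iso_data.len() <= 3 * CD_SECTOR, "iso data must fit in sectors 16 to 18");
    assert!(stage1.len() <= CD_SECTOR, "stage 1 must fit in sector 19");

    let mut iso = vec![0u8; 20 * CD_SECTOR];
    let data_start = 16 * CD_SECTOR; // byte offset 0x8000
    iso[data_start..data_start + iso_data.len()].copy_from_slice(iso_data);
    let stage1_start = 19 * CD_SECTOR;
    iso[stage1_start..stage1_start + stage1.len()].copy_from_slice(stage1);
    iso
}

fn main() {
    // Dummy payloads; the real ones come from the ISO data and stage 1 files.
    let iso_data = vec![0x11u8; 3 * CD_SECTOR];
    let stage1 = vec![0x22u8; CD_SECTOR];
    let iso = assemble_iso(&iso_data, &stage1);
    println!("image size: {} bytes", iso.len());
    println!("first descriptor byte at {:#x}", 16 * CD_SECTOR);
}
```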

The executable I loaded in the last sector was generated from this assembly

org 0x7c00 ; address at which the bios will load this executable
bits 16 ; 16 bit mode

    ; initialise pointers
    mov ax, 0
    mov ds, ax ; data segment 0
    mov ss, ax ; stack segment 0
    mov es, ax ; extra segment 0?
    mov sp, 0x7c00 ; set stack pointer at the start of this executable

_start:
    mov si, hello
    call puts
    jmp other ; jump into code after the 512th byte

; si=pointer to a NUL-terminated string
puts:
    lodsb
    or al, al
    jz .done
    call putc
    jmp puts
.done:
    ret

; al=char
putc:
    mov ah, 0eh
    int 10h
    ret

hello: db 'hello world!', 10, 13, 0
hello_len: equ $-hello

meme: db 'hello meme!', 0
meme_len: equ $-meme

times 510 - ($ - $$) db 0 ; fill with 0s until bytes 511
db 0x55, 0xaa ; mark the sector as bootable by setting the bytes 511 and 512

other:
    mov si, meme
    call puts
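.hang:
    hlt ; wait for the next interrupt
    jmp .hang ; an interrupt resumes execution after hlt, so loop back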

times 2048 - ($ - $$) db 0 ; fill the rest of the disk sector with 0s

Is this even remotely useful?

No.

Most of the questions asking how to boot off more than 512 bytes come from people trying to avoid writing a multi-stage bootloader, even though there are many benefits to separating your bootloader in stages. This article details a quirk of booting off a CD on the PC platform. None of it applies to ISOs burnt to USB drives or booting from a hard drive and thus would still require you to implement multi-stage booting.

Building a blog with Rust in 2023 – (the stupid way)

Building a blog with Rust in 2023

As per the name, this blog is imperfect. I built it with the idea of making it as easy as possible for me to get something released without bikeshedding too much. Building it in rust made it easier to do that, but it still came at a cost.

Requirements

There were a few things I was not willing to compromise on:

I don't want to have to write HTML to write blog posts, markdown all the way
I don't want to have to deal with NGINX (too many knobs to tune)
I don't want to rely on any JavaScript (I'm already making you read bad prose, I can't also have you run awful JS)
No PHP (it's personal I just don't like to write PHP)
HTTP and HTTPS

Nice to have

Only one binary with minimal configuration
Adding an article should not require me to restart the server
It should run on Linode's base tier machine AKA: a toaster
I'd like to be able to mix code and templates (PHP style)
No need for docker
It should be readable in Links

The options I considered

Any static site generator built around Markdown

Let's be honest. This blog is mainly composed of static pages. I could have gotten away with using a static website generator and hosting the HTML files on GitHub or something. But I wanted to avoid having individual HTML pages for each page of . I looked into a couple of options, but honestly, it looked more fun to build my own thing than to use someone else's.

Briefly considering Python

I looked into using Python, more specifically Flask, as I have used it in the past, but having maintained Python projects, I know they tend to rot. With Python lacking a way to pin down a version of a dependency, and Pip being generally unhygienic, I found that trying to get a Python project deployed outside a Docker container is more complicated than it should be. And I did not want to have to use Docker. I know there are ways to make it work, but I did not want to end up in dependency hell, and there was something else I wanted to try...

What about rewriting it in Rust?

I've been working in Rust for the last year or so at my $JOB, and I must say, it has grown on me. Some parts of the language are not as mature as I'd like them to be (custom allocators, async traits, etc), but overall I'd say it's a good replacement for projects where you'd typically reach for C++ or Java. I've heard a few people talking about using it to build server backends, and I wanted to learn more about the state of Rust frameworks for the web.

The popular Rust web frameworks

Early on, I learned about different web frameworks, mostly through Flosse's rust web framework comparison. The list makes it easy to know if a given framework supports a common feature, and I encourage anyone who thinks about using Rust to build web applications to give it a look. It contains a list of both frontend frameworks and backend frameworks. The frontend frameworks compile to WebAssembly; even though it's not JavaScript, I still wanted to steer clear of requiring the user to run code to display this website.

Rocket

Even though Rocket is marked as "outdated" in Flosse's list, development seems to still be going strong. In fact, the most recent commits are only about 2 weeks old, at the time of writing. Rocket is very much a "batteries included" type of framework. I was quickly able to get an early version of the page going, but I ended up not using it because I found the number of dependencies a bit too high and the build times (on my old decaying laptop) too slow for my liking. It looks to be a great framework that comes with everything you would need to build complex websites with forms and stuff, but it felt overkill for my usage.

Warp

I was looking for a framework that would be a tad smaller. Having used warp at my $JOB before, I briefly considered it. Warp is built on top of Rust generics and its type system. This means that a lot of it feels magic: just add a few filters and some serde::Deserialize implementing types, and you'll have a working API endpoint in no time... Except that warp, due to its liberal use of generics, contributes a lot to the overall time it takes to build our projects. But my biggest gripe with warp is that when things go wrong (which is at compile time, at least) it generates compile errors that compete with some of the worst C++ template errors I've had the displeasure of seeing.

Actix Web

Actix Web describes itself as a "micro-framework" (much like Flask is often described). It handles routing, HTTP/1, HTTP/2, HTTPS and typed URL queries (q?key=value). The only thing I needed was templating. It required an async main since it's built on top of tokio, but it does not manage every part of the program the same way Rocket would. To me, it felt easier to compose with other libraries, so I stuck with it.

HTML templating

When I was experimenting with Rocket, I also tried handlebars as the main templating engine. It worked well, but to me, it felt awkward to have the code and the format in 2 different places. The last time I did any kind of web development was back in college, most of which was done with PHP. Although I don't really like PHP, there was one thing I really liked (and apparently other people don't): you can mix HTML and code.

typed-html

When I found typed-html, it seemed to be exactly what I was looking for: I could embed HTML with the html! macro and use Rust expressions within that macro to build web pages server-side. The first page I built was the page and I quickly ran into a limitation of typed-html: because its goal is to make it easy to build correct HTML through type safety, it won't allow you to use a code block as the first child of certain tags.


 { /* rust code */ } 



    "A title"
    { /* rust code */ }

This is done so that it can guarantee a certain level of correctness, i.e. no s in s. I wanted to have functions to define common headers and common footers for each page, but this limitation made it pretty awkward. One thing I took away from the experiment (probably the wrong one based on the library's name): I could use Rust macros to embed arbitrary tokens in my Rust code.

Building my own version of the `html!` macro

I wanted to be able to reference variables and evaluate expressions, not just enumerations, in { } brackets. I found typed-html pretty limiting, and I hit maximum recursion a few times while trying to build fairly simple pages, so I had to write my own. I won't go into details, but I made the code available on GitHub under https://github.com/lorlouis/html_template. The code is definitely not perfect, but it worked well enough to build this blog.

With that, I had all the elements I needed to build this blog.

Here's part of the code I use to turn a Markdown file into an article

...
let body: Root = html!{
    
    
        
            {common_head(real_title.clone(), author.cloned(), blurb.cloned())}
        
        
            
            { common_header() }
            
            
            { markdown.to_html() }
            
            
            { common_footer() }
            
        
    
}.into();
...

Final Thoughts

In the end, I built a fairly unsophisticated blog using mostly pre-existing libraries. The downside of this approach is that while I was paying attention to not pull in too many dependencies, I now depend on 168 external dependencies. Using Actix Web made routing and handling query parameters really easy. I'm also glad I built html_template as it was the first time I had ever used Rust's proc-macros, and it made building HTML pages in-code much easier.

An imperfect blog – for the sake of getting it out-there

An imperfect blog

I have trouble finishing projects. My Projects/ directory is filled with half-finished parsers, barely working game engines, three slightly different versions of a web server and more than a dozen other side projects I started but never finished. This blog is no exception; I started considering having my own website two years ago, and in the meantime, the only things I built towards that goal are:

A web server written in C that I gave up on while trying to handle multiple connections per thread
A few CGI scripts to learn how CGI works
And a barely functioning templating system built around the C preprocessor

As you can see, I made zero progress on actually writing a blog. I thought about it and decided I should start finishing projects more often.

The average life cycle of a project

      Get a project idea
              ¦ <---------------.
              v                  \
 Try and build the "hard" part    \
        /             \            ¦
   [success]       [failure]       ¦
       v               v          /
Get bored and move   learn more  /
to another project   on the subject

    ??? -> Finished project

Whenever I start a new project, I tend to try to build what I consider the "hardest" part first. This means that when I made a 2.5D "engine" à la Wolfenstein, I only wrote the bare minimum to get to the code that renders a maze. This meant that I learned a lot about how to write a raycaster, and I even got it to "pan" the camera up and down. But once I got that working, I completely lost interest and started a new project to learn about some other concept I thought was interesting at the time.

Chasing perfection

Usually, when I work on a project, it is to learn about some concept I heard of recently. 99% of my projects are never intended to be shared, but that did not use to be the case. I used to put whatever I was working on, finished or not, in a public repo on my GitHub. As I learned more, I became critical of my code. It allowed me to improve my craft considerably, but it came at a cost: rereading a piece of code a few months after having written it makes me ashamed.

Getting myself to finish something

I don't want to push myself to finish every project I start; otherwise I'll end up not starting any new projects. I think I should prioritise certain projects, such as this blog, and get them released.

If you can read this, I managed to get something imperfect out the door.
