reinstalling jetpack 4.4 official on the jetson xavier nx

I reinstalled JetPack 4.4 on my NX. It wasn’t like I had a choice in the matter. A couple of nights back I picked up the latest Ubuntu 18.04/L4T updates, and when it rebooted, it refused to come back up. After all that work playing with it, and with no backup, I pulled the current image from the Nvidia website and started over again.

Which was, in hindsight, actually a Good Thing. The Developer’s Preview had picked up a bit of cruft from all my experimentation. Being forced to recreate the boot image wasn’t all that horrible; it just cost me a detour and a chunk of time I really didn’t want to give up.

It wasn’t all that bad, actually. After all, you flash a micro SDXC card, poke it into the NX, and apply power. Fortunately for me the bits I cared about were on my blog, and I used my other Linux notebook to basically copy my home directory to a 64GB thumb drive, from which I cherry-picked the bits I wanted to move over to the new install.

One of the oddball problems I ran into was re-installing Deno. On the 4.4 Developer’s Preview I’d installed Deno via Rust’s cargo, and got Deno version 1.0.2 up and running. This time, Deno had moved up to version 1.1.3 and failed to build via cargo. I’ve gotten Deno back by pulling the source tarball off of Deno’s GitHub site. I discovered what the cargo build failure was by watching the source tarball build: the deno_lint module that cargo pulls in is version 0.1.16, and it fails all over the place with unresolved calls. The deno_lint module in the source tarball is version 0.1.15, and it builds. After the build, I pushed my binary over to ~/.local/bin (which is in my path) and carried on using Deno.

Oh, and I pulled the latest Emacs from their git site, and I’m now running with version 28.0.50. Some things are actually a smidge better with this release.

my very first aarch64 assembly program

I have been thinking for some number of years of learning how to write ARM assembly language applications. The driver for this is an old book, “Threaded Interpretive Languages: Their Design and Implementation” by R. G. Loeliger. I purchased a copy back in 1981, the year the book was first published, at a local Book Stop in Atlanta. A threaded interpretive language, or TIL, is the general classification for Forth and other languages like it. I wasn’t yet married and was still something of a night owl, so I had time on my hands, so to speak, to investigate how to implement a TIL. Over a period of three years I implemented the same personal TIL on the Z80 (the book was written to use a Z80), an MOS 6502, a Motorola 68000, and an Intel 8086. By the time I put the book aside I was doing a lot more “serious” work in C and Pascal on DEC minis and IBM PCs. The only time I went back to writing any assembly was later in the 1990s, when I wrote a fair bit of code on the side for a 65c802 and an Intel 80c196. Both of these were custom designed embedded systems. Unfortunately I didn’t write a TIL for either one, as the requirements directed me to spend my time on other functionality.

So this weekend I decided to dig a bit into the ARM processor driving the Nvidia Xavier NX. I went around the net a bit looking for examples and tutorials, and finally managed to cobble together a “Hello World” program in aarch64/ARMv8 assembly. Here’s the tiny application I wrote.

#include "include.h"

	.global	_start
	.text

_start:
	mov	x8, __NR_write
	mov	x2, hello_len
	adr	x1, hello_txt
	mov	x0, STDOUT_FILENO
	svc	0

	mov	x8, __NR_exit
	svc	0

	.data

hello_txt:	.ascii "Hello, World!\n"
hello_len = . - hello_txt

I’m following, for the most part, this tutorial: https://modexp.wordpress.com/2018/10/30/arm64-assembly/, “A Guide to ARM64 / AArch64 Assembly on Linux with Shellcodes and Cryptography.” The code and the instructions for assembling it are toward the middle. My only comment about the program is that line 9 in my listing is different from the original. I found that to load the register with the address of the string to print, I needed to explicitly code the adr mnemonic. For whatever reason the tools on the Xavier otherwise interpreted the line as a mov instruction, and nothing would print.
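
For readers who don't read assembly, the two system calls in the listing can be sketched in Python. This is only a rough analogue, not a translation of the tutorial's code: os.write wraps the same write system call, the function names write_hello and main are made up for illustration, and the exit status of 0 is an assumption (the listing never loads x0 before the exit call, so its actual status is whatever write left in that register).

```python
import os
import sys

STDOUT_FILENO = 1  # the descriptor number the listing puts in x0


def write_hello(fd=STDOUT_FILENO):
    """Mirror the first svc: write(fd, hello_txt, hello_len)."""
    hello_txt = b"Hello, World!\n"  # the .ascii string in .data
    hello_len = len(hello_txt)      # hello_len = . - hello_txt
    # os.write returns the number of bytes the kernel accepted.
    return os.write(fd, hello_txt[:hello_len])


def main():
    """Mirror the whole program: one write, then exit."""
    write_hello()
    sys.exit(0)  # assumed status; see the note above about x0
```

Calling main() prints the greeting and exits. The point of the sketch is how little the assembly version actually does: fill in a descriptor, a buffer address, and a length, then trap into the kernel twice.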

Because of the number of steps involved in building the app, I wrote a bit of Python 3 to automate the process a bit. My Python code turned out to be longer than my assembly code.

#!/usr/bin/env python3

import argparse
import os
from pathlib import Path
import subprocess
import sys

if sys.version_info < (3, 6):
    print("You are using Python version {}.{}.{}".
        format(sys.version_info.major,sys.version_info.minor,sys.version_info.micro))
    print("Python version 3.6.0 or higher is required.")
    sys.exit(1)

parser = argparse.ArgumentParser()
parser.add_argument("source", help="Assembly source file name is required.")
args = parser.parse_args()

if not os.path.isfile(args.source):
    print("File {} can't be found.".format(args.source))
    sys.exit(1)

filestem = Path(args.source).stem

steps = [
    ["cpp", "-E", args.source, "-o", filestem + ".as"],
    ["as", filestem + ".as", "-o", filestem + ".o"],
    ["ld", filestem + ".o", "-o", filestem],
]

for step in steps:
    print(" ".join(step))
    p = subprocess.run(step)
    if p.returncode != 0:
        sys.exit(1)

There are no comments, and only a little white space to make it readable to me. I really didn’t feel like diving into either make or cmake. Hopefully I haven’t embarrassed myself too much with either program.

What I’ve discovered so far is that there are a lot of 32-bit ARM assembly tutorials that won’t work at all with the Xavier. They just won’t assemble. But I am moving along a bit after this. I have Loeliger’s inner and outer interpreter coded, and two words in a dictionary. I’ll begin to post this effort shortly. As for why, well, why not? If nothing else, this hello program is 1,104 bytes long, which beats the size of Go’s basic hello world program by, what, three orders of magnitude?
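
The size comparison is easy to check with a line of arithmetic. Here is a minimal sketch, assuming a Go hello world binary of roughly 2 MB. That figure is an assumption for illustration only; the post doesn't give one, and the real size depends on the Go version and build flags. The function name orders_of_magnitude is likewise made up.

```python
import math


def orders_of_magnitude(small_bytes, large_bytes):
    """Return log10 of the size ratio between two binaries."""
    return math.log10(large_bytes / small_bytes)


ASM_HELLO_BYTES = 1104       # size quoted in the post
GO_HELLO_BYTES = 2_000_000   # ASSUMPTION: ballpark size, not a measurement

ratio = orders_of_magnitude(ASM_HELLO_BYTES, GO_HELLO_BYTES)
print("about {:.1f} orders of magnitude".format(ratio))
```

With that assumed figure the gap comes out a little over three orders of magnitude. Swapping in os.path.getsize() on real binaries would give an exact number for a particular toolchain.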

There’s just something bracingly honest about writing in assembly that no other method of coding can approach.