What would it take to dynamically load an ELF shared object at runtime and execute its functions, without compile-time knowledge of those functions? There are approximately 3300 shared object files on my computer. Imagine having runtime access to all that functionality without needing a compilation step.
#include <dlfcn.h>
This gives access to dlopen() and dlsym(). Two problems:
dlsym() requires you to know the name of the symbol.
How can we programmatically assemble parameters to prepare to call a function?
For the second problem, we can use libffi to assemble function parameters for calling. From this example:
ffi_call(&cif, function_pointer, &rc, values);
Where function_pointer is something we got from dlsym(), using a handle returned by dlopen().
nm -D /usr/lib/x86_64-linux-gnu/libgtk-3.so | grep ' T '
But how to do this programmatically? What does nm itself use?
ldd /usr/bin/nm
    linux-vdso.so.1 (0x00007fff1b773000)
    libbfd-2.28-system.so => /usr/lib/x86_64-linux-gnu/libbfd-2.28-system.so (0x00007f4324599000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f432437f000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f432417b000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4323ddc000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f4324aed000)
libbfd looks interesting, and it seems to do what we might expect. From a Stack Overflow answer:
#include <bfd.h>
bfd *abfd;
asection *p;
char *filename = "/path/to/my/file";
if ((abfd = bfd_openr(filename, NULL)) == NULL) {
    /* ... error handling */
}
if (!bfd_check_format (abfd, bfd_object)) {
    /* ... error handling */
}
for (p = abfd->sections; p != NULL; p = p->next) {
    bfd_vma  base_addr = bfd_section_vma(abfd, p);
    bfd_size_type size = bfd_section_size (abfd, p);
    const char   *name = bfd_section_name(abfd, p);
    flagword     flags = bfd_get_section_flags(abfd, p);
    if (flags & SEC_CODE) {
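        /* bfd_vma and bfd_size_type are integers, not pointers, so cast them for printf */
        printf("%s: addr=0x%lx size=%lu\n", name, (unsigned long) base_addr, (unsigned long) size);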
    }
}
Here is where things get a little tricky. The shared objects themselves don't have information about types. For example a struct may be defined but when it's compiled that sub-type information may get lost and the function simply receives a pointer to some blob in memory. From what I can tell the only way to know is to parse the header file.
Some prior art:
One option is the GCC::TranslationUnit Perl module, which parses the output of gcc's -fdump-translation-unit flag:
gcc `pkg-config --cflags gtk+-3.0` `pkg-config --libs gtk+-3.0` -fdump-translation-unit -o test test.c
# head test.c.001t.tu
@1      type_decl        name: @2       type: @3       chain: @4      
@2      identifier_node  strg: int      lngt: 3       
@3      integer_type     name: @1       size: @5       algn: 32      
                         prec: 32       sign: signed   min : @6      
                         max : @7      
@4      type_decl        name: @8       type: @9       chain: @10     
@5      integer_cst      type: @11     int: 32
@6      integer_cst      type: @3      int: -2147483648
@7      integer_cst      type: @3      int: 2147483647
@8      identifier_node  strg: char     lngt: 4
An example for the GtkWindow struct:
@52321  identifier_node  strg: _GtkWindow              lngt: 10
@52322  identifier_node  strg: bin      lngt: 3
@52323  record_type      name: @52373   unql: @52374   size: @1878
                         algn: 64       tag : struct   flds: @52375
@52324  field_decl       name: @9177    type: @52376   scpe: @52273
                         srcp: gtkwindow.h:57          size: @22
                         algn: 64       bpos: @1878
From the header:
struct _GtkWindow
{
  GtkBin bin;
  GtkWindowPrivate *priv;
};
Looking at the record type's name, @52373:
@52373  type_decl        name: @52430   type: @52323   scpe: @154
                         srcp: gtkbin.h:45             chain: @52431
Its name is @52430:
@52430  identifier_node  strg: GtkBin   lngt: 6
Looking at size: @1878:
@1878   integer_cst      type: @11     int: 384
And type: @11:
@11     integer_type     name: @18      size: @19      algn: 128
                         prec: 128      sign: unsigned min : @20
                         max : @21
So it looks like this translation unit file has all the information needed to understand what size ints are being used, map all the typedefs back to their core types, etc. Advantages are that gcc is doing all the preprocessing and parsing of the text. Disadvantages are that you need to compile something to get this dump, and then you need to parse the dump. This could be something done once per target architecture and then stored in some DB.
What process ids are dynamically linking to a particular object?
lsof /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0
What objects are linked to a given process id?
lsof -p 22159 | grep '\.so'
It's theoretically possible to dynamically link and dynamically call functions from shared objects (naturally, computers are flexible). The main blocker to this is the knowledge of types for function calls. What I would like is a C library that can somehow provide detailed type information for a given object. This could be made either by parsing or compiling header files which would prove difficult because:
The most fruitful path here would be to use parts of LLVM. However we seem to be missing a pure C version of this functionality. Another avenue is to use existing tools in a once-off operation, and save this information to some global database. This could even be a SQL database like SQLite, making it really transparent. I am tempted to make this database and provide it online for all distributed libraries in Debian. It would need the following dimensions:
There would potentially need to be a separate definition for each combination of the above. An example:
"x86_64" + "libX11.so" + "6.3.0" + "XCreateWindow" + "DEFAULT"
Which would identify a specific symbol in the shared object. For that symbol there would be a structured machine readable record describing various types and alignments. Software could read this definition, dynamically load the library and assemble function arguments or structures based on this information. Programs could download the SQLite file of this database, or select a subset of the database for their architecture or list of libs. This database could also be installed via Debian packaging.