Fixing LDC UDA Recognition And Multiple Definition Errors

Dec 4, 2025 by Alex Johnson

Magic UDAs Don't Get Recognized by uda.cpp in LDC

Have you ever encountered a peculiar issue in LDC where your User-Defined Attributes (UDAs) seem to vanish into thin air, only to be followed by a confusing ‘multiple definition’ error from the linker? You’re not alone! This article dives deep into a specific bug where LDC’s uda.cpp fails to recognize certain UDAs, leading to conflicts when symbols are defined in multiple compilation units. We’ll explore why this happens, how it manifests, and potential solutions to keep your D code compiling smoothly.

The Enigma of Undetected UDAs

This issue primarily surfaces when you’re working with UDAs defined using immutable, particularly when they are declared in separate modules and then linked together. The core of the problem lies in how LDC processes these attributes. Let’s break down the provided code example to understand the mechanics.

In test1.d, we have a simple global boolean variable:

```
extern(C) __gshared bool test = true;
void main() {}
```

This sets up a symbol named test that is globally shared and has C linkage, so it appears in the object file under the unmangled name test. The main function is there only to give the final executable an entry point.

Now, let's look at test2.d:

```
module test2;
import ldc.attributes : weak;

extern(C) __gshared @weak bool test = false;
pragma(msg, __traits(getAttributes, test)); // this triggers the error
```

Here, we import the weak attribute from ldc.attributes. Crucially, we define the same global variable test but with the @weak attribute and initialize it to false. The line pragma(msg, __traits(getAttributes, test)); is the trigger. When LDC compiles test2.d and encounters this pragma, it attempts to inspect the attributes of the test variable. This is where the breakdown occurs.

The `uda.cpp` Conundrum

LDC's internal compiler logic, specifically in gen/uda.cpp, is responsible for scanning symbols and identifying their associated attributes (UDAs). The standard approach involves checking if an expression is a StructLiteralExp. However, the problem arises because, by the time uda.cpp checks these attributes, the expressionSemantic phase might have already run. This phase transforms the attribute expression from its literal form (like a struct literal) into a VarExp (a reference to a variable).

When uda.cpp encounters a VarExp instead of the expected StructLiteralExp, it fails to recognize the UDA, so the @weak attribute is never applied and test is emitted in test2.o as an ordinary (strong) definition. Consequently, when test2.o is linked with test1.d using the command ldc2 -c test2.d && ldc2 test1.d test2.o, the linker finds two strong definitions of test: one in test1.o (in the .data section, because it is initialized to true) and one in test2.o (in the .bss section, because false is a zero initializer). With no weak definition that it could discard in favor of the other, the link fails with:

```
AliasSeq!(_weak())
/usr/bin/ld: test2.o:(.bss.test+0x0): multiple definition of `test'; test1.o:(.data.test+0x0): first defined here
collect2: error: ld returned 1 exit status
```

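The AliasSeq!(_weak()) line is not part of the linker error. It is the output of pragma(msg, ...), printed while test2.d is compiled, and it shows that the frontend does see the attribute on test. What goes missing is the later step in which uda.cpp is supposed to recognize that attribute and act on it.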

Why `immutable` Matters

The observation points to a specific LDC internal structure: UDAs are often defined using immutable within ldc.attributes.d. For instance:

```
immutable weak = _weak();
private struct _weak {}
```

This pattern involves an immutable variable weak holding an instance of a private struct _weak. The immutable keyword plays a role in how these attributes are represented and processed during compilation. Changing immutable to enum can fix the issue, suggesting that the exact type and mutability of the attribute definition might affect the expressionSemantic phase and subsequent UDA recognition.

However, there's likely a good reason why immutable was initially chosen for these internal LDC attributes. It might relate to performance, compile-time evaluation guarantees, or consistency with other internal compiler mechanisms. Simply changing it without understanding the implications could introduce new, perhaps more subtle, problems.

Reproducing the Error: A Step-by-Step Guide

To solidify your understanding, let's walk through reproducing this error. You'll need the LDC compiler installed.

Create test1.d: Save the following code into a file named test1.d:
```
extern(C) __gshared bool test = true;
void main() {}
```

Create test2.d: Save the following code into a file named test2.d:

```
module test2;
import ldc.attributes : weak;

extern(C) __gshared @weak bool test = false;
pragma(msg, __traits(getAttributes, test));
```

Compile and Link: Open your terminal or command prompt, navigate to the directory where you saved the files, and execute the following commands:
```
ldc2 -c test1.d
ldc2 -c test2.d
ldc2 test1.o test2.o -of=myprogram
```
- The first command compiles test1.d into test1.o without linking.
- The second command compiles test2.d into test2.o. This is where the pragma(msg) is evaluated and prints AliasSeq!(_weak()). In the reported output the compiler itself emits no warning or error at this point; the problem only becomes visible at link time.
- The third command attempts to link test1.o and test2.o together into an executable named myprogram. This is where the multiple definition error will definitely manifest.

Expected Outcome:

Instead of a successful executable, you will see the pragma(msg) output from compiling test2.d, followed by the linker error:

```
AliasSeq!(_weak())
/usr/bin/ld: test2.o:(.bss.test+0x0): multiple definition of `test'; test1.o:(.data.test+0x0): first defined here
collect2: error: ld returned 1 exit status
```

This clearly indicates that the linker found two definitions for test, and the compiler failed to correctly apply the @weak attribute from test2.d to resolve this conflict.

Understanding the Root Cause: Semantic Analysis and Attribute Resolution

The crux of the problem lies in the timing and nature of LDC's semantic analysis and attribute handling. When the compiler processes code, it goes through several phases. One crucial phase is semantic analysis, where the compiler checks for type correctness, variable declarations, scopes, and resolves symbols. Another critical part is the UDA resolution, which identifies and applies user-defined attributes to declarations.

The Role of `expressionSemantic`

In D, attributes are often represented as expressions. For instance, @weak internally might correspond to a struct or a function call that returns a special type representing the attribute. The expressionSemantic function is responsible for taking these attribute expressions and determining their meaning and type within the context of the declaration they are attached to. It performs checks, resolves types, and potentially transforms the expression into a more canonical form.

When you have code like extern(C) __gshared @weak bool test = false;, the @weak part is an expression. The expressionSemantic function is called to analyze this expression. If this phase runs before the UDA scanning mechanism in uda.cpp looks for attributes, it can change the representation of the attribute.
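To make "attributes are expressions" concrete, here is a minimal sketch in plain D. The Tag struct and the counter variable are hypothetical names chosen for illustration; they are not part of ldc.attributes, and the example does not reproduce the bug itself.

```d
import std.meta : AliasSeq;

// Hypothetical attribute type for illustration; not part of ldc.attributes.
struct Tag
{
    int level;
}

// Each @... is an ordinary expression that must be evaluable at compile time.
@Tag(3) @("note") __gshared int counter;

// Collect the attribute expressions attached to `counter`.
alias attrs = AliasSeq!(__traits(getAttributes, counter));

// The attributes can be inspected like any other compile-time values.
static assert(attrs.length == 2);
static assert(attrs[0].level == 3);
static assert(attrs[1] == "note");

void main() {}
```

Because every attribute is an expression, the compiler is free to run semantic analysis on it, and the form it has afterwards is what any later pass gets to see.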

How `uda.cpp` Scans for Attributes

The uda.cpp file contains logic for iterating through the Abstract Syntax Tree (AST) of the code and identifying declarations that have associated UDAs. A key part of this logic, as mentioned in the bug report, is checking isStructLiteralExp(). This check is designed to identify attributes that are directly represented as struct literals in the AST. For example, if a UDA is written as @MyAttr(), where MyAttr is a simple struct, the attribute expression is a struct literal and this check works well.

However, the issue arises when expressionSemantic has already processed the attribute expression. If expressionSemantic transforms the attribute expression (e.g., @weak) into something else, like a VarExp (which represents a variable reference), the isStructLiteralExp() check in uda.cpp will fail. It no longer sees a direct struct literal but rather a reference to something else. This means the UDA is effectively missed by this specific scanning mechanism.
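The mismatch can be illustrated with a toy model written in D. Everything in it is an assumption made for illustration: the classes are simplified stand-ins that borrow the names StructLiteralExp and VarExp, and recognizeStrict and recognizeLenient are hypothetical helpers. It is not LDC's actual code, and the lenient variant is only one conceivable direction for a fix.

```d
// Toy model only: hypothetical stand-ins, not LDC's real AST types.
class Expression {}

class StructLiteralExp : Expression
{
    string structName;
    this(string structName) { this.structName = structName; }
}

class VarExp : Expression
{
    Expression initializer; // what the referenced variable was initialized with
    this(Expression initializer) { this.initializer = initializer; }
}

// Mirrors the described check: only a direct struct literal counts as a UDA.
string recognizeStrict(Expression e)
{
    if (auto literal = cast(StructLiteralExp) e)
        return literal.structName;
    return null; // anything else is silently ignored
}

// One conceivable alternative: look through a variable reference first.
string recognizeLenient(Expression e)
{
    if (auto reference = cast(VarExp) e)
        e = reference.initializer;
    return recognizeStrict(e);
}

void main()
{
    auto literal = new StructLiteralExp("_weak");
    auto viaVariable = new VarExp(literal);

    assert(recognizeStrict(literal) == "_weak");      // literal form is recognized
    assert(recognizeStrict(viaVariable) is null);     // variable reference is missed
    assert(recognizeLenient(viaVariable) == "_weak"); // looking through it recovers the UDA
}
```

The strict check fails silently rather than reporting an error, which matches the behavior described above: nothing complains at compile time, and the missing attribute only shows up later at link time.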

The `pragma(msg, __traits(getAttributes, test))` Trigger

The pragma(msg, __traits(getAttributes, test)) line is a clever way to expose this internal compiler behavior. __traits(getAttributes, test) asks the compiler to report all attributes associated with the test variable at compile time. For this trait to work correctly, the compiler must have successfully identified and processed all attributes during semantic analysis. The fact that this pragma triggers the problem suggests that the evaluation of __traits(getAttributes, ...) forces the compiler to perform attribute resolution earlier or in a way that highlights the discrepancy between the semantic analysis phase and the uda.cpp scanning logic.

Note that the trait itself does report the attribute: the output shown earlier contains AliasSeq!(_weak()). The likely explanation is that evaluating the trait runs expressionSemantic on the attribute expression earlier than it would otherwise run. By the time uda.cpp scans the declaration during code generation, it then finds a VarExp rather than a struct literal and skips the attribute.

The `immutable` vs. `enum` Distinction

The observation that changing immutable to enum fixes the issue is significant. In D, the two differ in what they declare. An immutable declaration creates a real variable: it has storage and an address, and an expression that names it refers to that variable. An enum declaration creates a manifest constant: it has no storage, and its value is substituted wherever the name is used.

It's plausible that the immutable weak = _weak(); construct, when processed by expressionSemantic, results in an AST node that uda.cpp doesn't recognize (perhaps it becomes a VarExp pointing to the immutable variable weak). However, if enum weak = _weak(); were used, the resulting AST node might remain a StructLiteralExp or be handled differently, allowing uda.cpp to correctly identify it.
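The language-level difference between the two forms can be checked in plain D. In this sketch, Marker, viaImmutable and viaEnum are hypothetical names standing in for the private struct and the attribute symbol. It only demonstrates the variable-versus-value distinction; it does not show what LDC's AST looks like internally.

```d
import std.meta : AliasSeq;

// Hypothetical marker type, standing in for the private struct behind @weak.
struct Marker {}

immutable viaImmutable = Marker(); // a real variable with storage
enum viaEnum = Marker();           // a manifest constant, substituted at each use

@viaImmutable int a;
@viaEnum int b;

alias attrsA = AliasSeq!(__traits(getAttributes, a));
alias attrsB = AliasSeq!(__traits(getAttributes, b));

// Both spellings attach exactly one attribute.
static assert(attrsA.length == 1);
static assert(attrsB.length == 1);

// The immutable form is a typed variable; the enum form is a plain value.
static assert(is(typeof(viaImmutable) == immutable(Marker)));
static assert(is(typeof(viaEnum) == Marker));

// Only the variable has an address that can be taken.
static assert(__traits(compiles, &viaImmutable));
static assert(!__traits(compiles, &viaEnum));

void main() {}
```

From the user's side both declarations behave the same when used as attributes, which is why the difference only matters to a compiler pass that inspects the shape of the attribute expression.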

Why would LDC use immutable? It could be that the internal representations of attributes are designed to be constant and shareable, and immutable provides a strong guarantee for this. It might also be related to how LDC integrates with LLVM's attribute system, which often deals with constant metadata.

Potential Workarounds and Solutions

While this issue points to a potential bug within LDC's compiler internals, there are ways to navigate around it or address it:

- Avoid @weak with __gshared across Modules: The most direct workaround is to avoid using @weak (or potentially other UDAs exhibiting similar behavior) on __gshared variables that are defined in multiple modules and intended to be linked. If possible, refactor your code to have a single definition or manage the linkage differently.
- Use enum for Attribute Definitions (with Caution): As noted, changing immutable to enum for the internal definition of the UDA might resolve the uda.cpp recognition issue. However, this should be done with extreme caution. If immutable was chosen for specific reasons (e.g., interaction with LLVM, compile-time evaluation guarantees), changing it might have unintended consequences elsewhere in the compiler or affect performance. This is generally a solution for compiler developers rather than end-users, unless you are certain about the implications.
- Report the Bug to LDC: This behavior, especially the reliance on isStructLiteralExp and the interaction with expressionSemantic and __traits, strongly suggests a bug in LDC. Reporting this issue on the LDC issue tracker (e.g., on GitHub) is crucial. Provide the minimal reproducible example (test1.d, test2.d, and the compile command) as done in the initial report. This allows the LDC developers to investigate, fix the underlying problem in uda.cpp or the semantic analysis phases, and ensure better UDA handling in the future.
- Alternative Linker Flags: Linker behavior can sometimes be influenced by flags, which ldc2 forwards to the linker with -L. With GNU ld, -L--allow-multiple-definition silences the error, but it is generally discouraged because it hides genuine duplicate definitions. Either way, this works around the linker error rather than the compiler bug.
- Compile as a Single Unit: If feasible for your build system, compiling all involved source files in a single ldc2 invocation (ldc2 test1.d test2.d -of=myprogram) might bypass the issue. When compiled together, the compiler has a more global view and might correctly associate attributes without generating separate object files that cause linking conflicts. However, this doesn't scale well for larger projects and doesn't fix the root cause.

Conclusion

The interaction between LDC's UDA processing, semantic analysis, and the linker can sometimes lead to complex and confusing errors. The specific bug where uda.cpp fails to recognize attributes due to transformations by expressionSemantic highlights the intricate nature of compiler design. While workarounds exist, the most constructive approach is to contribute to the LDC project by reporting such issues, providing clear reproducible examples, and helping the developers enhance the compiler's robustness.

Understanding these compiler internals not only helps in debugging but also deepens our appreciation for the complex machinery that turns our source code into executable programs. For more insights into D language features and LDC specifics, you can explore the official D Language Foundation website or dive into the LDC compiler's GitHub repository.
