A utility program called an assembler is used to translate assembly language statements into the target computer's machine code. The assembler performs a more or less isomorphic translation (a one-to-one mapping) from mnemonic statements into machine instructions and data. (This is in contrast with high-level languages, in which a single statement generally results in many machine instructions.)
Many sophisticated assemblers offer additional mechanisms to facilitate program development, control the assembly process, and aid debugging. In particular, most modern assemblers (although many have been available for more than 40 years already) include a macro facility (described below), and are called macro assemblers.
Typically a modern assembler creates object code by translating assembly instruction mnemonics into opcodes, and by resolving symbolic names for memory locations and other entities.[1] The use of symbolic references is a key feature of assemblers, saving tedious calculations and manual address updates after program modifications. Most assemblers also include macro facilities for performing textual substitution—e.g., to generate common short sequences of instructions to run inline, instead of in a subroutine.
Assemblers are generally simpler to write than compilers for high-level languages, and have been available since the 1950s. Modern assemblers, especially for RISC based architectures, such as MIPS, Sun SPARC, HP PA-RISC and x86(-64), optimize instruction scheduling to exploit the CPU pipeline efficiently.
More sophisticated high-level assemblers provide language abstractions such as:
- Advanced control structures
- High-level procedure/function declarations and invocations
- High-level abstract data types, including structures/records, unions, classes, and sets
- Sophisticated macro processing
- Object-Oriented features such as encapsulation, polymorphism, inheritance, interfaces
Current usage
There have always been debates over the usefulness and performance of assembly language relative to high-level languages. Assembly language has specific niche uses where it is important; see below. But in general, modern optimizing compilers are claimed to render high-level languages into code that can run as fast as hand-written assembly, despite some counter-examples that can be created. The complexity of modern processors makes effective hand-optimization increasingly difficult.[6] Moreover, and to the dismay of efficiency lovers, increasing processor performance has meant that most CPUs sit idle most of the time, with delays caused by predictable bottlenecks such as I/O operations and paging. This has made raw code execution speed a non-issue for most programmers.
Here are some situations in which practitioners might choose to use assembly language:
- When a stand-alone binary executable is required, i.e. one that must execute without recourse to the run-time components or libraries associated with a high-level language; this is perhaps the most common situation. These are embedded programs that store only a small amount of memory and the device is intended to do single purpose tasks. Such examples consist of telephones, automobile fuel and ignition systems, air-conditioning control systems, security systems, and sensors.
- When interacting directly with the hardware, for example in device drivers.
- When using processor-specific instructions not exploited by or available to the compiler. A common example is the bitwise rotation instruction at the core of many encryption algorithms.
- Embedded systems.
- When extreme optimization is required, e.g., in an inner loop in a processor-intensive algorithm. Some game programmers are experts at writing code that takes advantage of the capabilities of hardware features in systems enabling the games to run faster.
- When a system with severe resource constraints (e.g., an embedded system) must be hand-coded to maximize the use of limited resources; but this is becoming less common as processor price/performance improves
- When no high-level language exists, e.g., on a new or specialized processor
- Real-time programs that need precise timing and responses, such as simulations, flight navigation systems, and medical equipment. (For example, in a fly-by-wire system, telemetry must be interpreted and acted upon within strict time constraints. Such systems must eliminate sources of unpredictable delays – such as may be created by interpreted languages, automatic garbage collection, paging operations, or preemptive multitasking. Some higher-level languages incorporate run-time components and operating system interfaces that can introduce such delays. Choosing assembly or lower-level languages for such systems gives the programmer greater visibility and control over processing details.)
- When complete control over the environment is required (for example in extremely high security situations, where nothing can be taken for granted).
- When writing computer viruses, bootloaders, certain device drivers, or other items very close to the hardware or low-level operating system.
- When reverse-engineering existing binaries, which may or may not have originally been written in a high-level language, for example when cracking copy protection of proprietary software.
- Reverse engineering and modification of video games (known as ROM Hacking), commonly done to games for Nintendo hardware such as the SNES and NES, is possible with a range of techniques, of which the most widely employed is altering the program code at the assembly language level.
- Assembly language lends itself well to applications requiring Self modifying code.
- Assembly language is sometimes used for writing games and other software for graphing calculators.[7]
- Finally, compiler writers usually write software that generates assembly code, and should therefore be expert assembly language programmers themselves.
Nevertheless, assembly language is still taught in most Computer Science and Electronic Engineering programs. Although few programmers today regularly work with assembly language as a tool, the underlying concepts remain very important. Such fundamental topics as binary arithmetic, memory allocation, stack processing, character set encoding, interrupt processing, and compiler design would be hard to study in detail without a grasp of how a computer operates at the hardware level. Since a computer's behavior is fundamentally defined by its instruction set, the logical way to learn such concepts is to study an assembly language. Most modern computers have similar instruction sets. Therefore, studying a single assembly language is sufficient to learn: i) The basic concepts; ii) To recognize situations where the use of assembly language might be appropriate; and iii) To see how efficient executable code can be created from high-level languages.
Hard-coded assembly language is typically used in a system's boot ROM (BIOS on IBM-compatible PC systems). This low-level code is used, among other things, to initialize and test the system hardware prior to booting the OS, and is stored in ROM. Once a certain level of hardware initialization has taken place, execution transfers to other code, typically written in higher level languages; but the code running immediately after power is applied is usually written in assembly language. The same is true of most boot loaders.
Many compilers render high-level languages into assembly first before fully compiling, allowing the assembly code to be viewed for debugging and optimization purposes. Relatively low-level languages, such as C, often provide special syntax to embed assembly language directly in the source code. Programs using such facilities, such as the Linux kernel, can then construct abstractions utilizing different assembly language on each hardware platform. The system's portable code can then utilize these processor-specific components through a uniform interface.
Assembly language is also valuable in reverse engineering, since many programs are distributed only in machine code form, and machine code is usually easy to translate into assembly language and carefully examine in this form, but very difficult to translate into a higher-level language. Tools such as the Interactive Disassembler make extensive use of disassembly for such a purpose.
A particular niche that makes use of assembly language is the demoscene. Certain competitions require the contestants to restrict their creations to a very small size (e.g. 256B, 1KB, 4KB or 64 KB), and assembly language is the language of choice to achieve this goal.[9] When resources, particularly CPU-processing constrained systems, like the earlier Amiga models, and the Commodore 64, are a concern, assembler coding is a must: optimized assembler code is written "by hand" and instructions are sequenced manually by the coders in an attempt to minimize the number of CPU cycles used; the CPU constraints are so great that every CPU cycle counts. However, using such techniques has enabled systems like the Commodore 64 to produce real-time 3D graphics with advanced effects, a feat which might be considered unlikely or even impossible for a system with a 0.99MHz processor.
Q1. Explain the flowchart of pass-2 of a two pass assembler.