Linux and the Device Tree
-------------------------
The Linux usage model for device tree data

Author: Grant Likely <grant.likely@secretlab.ca>

This article describes how Linux uses the device tree.  An overview of
the device tree data format can be found on the device tree usage page
at devicetree.org[1].

[1] http://devicetree.org/Device_Tree_Usage

The "Open Firmware Device Tree", or simply Device Tree (DT), is a data
structure and language for describing hardware.  More specifically, it
is a description of hardware that is readable by an operating system
so that the operating system doesn't need to hard code details of the
machine.

Structurally, the DT is a tree, or acyclic graph with named nodes, and
nodes may have an arbitrary number of named properties encapsulating
arbitrary data.  A mechanism also exists to create arbitrary
links from one node to another outside of the natural tree structure.

Conceptually, a common set of usage conventions, called 'bindings',
is defined for how data should appear in the tree to describe typical
hardware characteristics including data busses, interrupt lines, GPIO
connections, and peripheral devices.

As much as possible, hardware is described using existing bindings to
maximize use of existing support code, but since property and node
names are simply text strings, it is easy to extend existing bindings
or create new ones by defining new nodes and properties.  Be wary,
however, of creating a new binding without first doing some homework
about what already exists.  There are currently two different,
incompatible, bindings for i2c busses that came about because the new
binding was created without first investigating how i2c devices were
already being enumerated in existing systems.

1. History
----------
The DT was originally created by Open Firmware as part of the
communication method for passing data from Open Firmware to a client
program (such as an operating system).  An operating system used the
Device Tree to discover the topology of the hardware at runtime, and
thereby support a majority of available hardware without hard coded
information (assuming drivers were available for all devices).

Since Open Firmware is commonly used on PowerPC and SPARC platforms,
the Linux support for those architectures has for a long time used the
Device Tree.

In 2005, when PowerPC Linux began a major cleanup and the merge of 32-bit
and 64-bit support, the decision was made to require DT support on all
powerpc platforms, regardless of whether or not they used Open
Firmware.  To do this, a DT representation called the Flattened Device
Tree (FDT) was created which could be passed to the kernel as a binary
blob without requiring a real Open Firmware implementation.  U-Boot,
kexec, and other bootloaders were modified to support both passing a
Device Tree Binary (dtb) and modifying a dtb at boot time.  DT was
also added to the PowerPC boot wrapper (arch/powerpc/boot/*) so that
a dtb could be wrapped up with the kernel image to support booting
existing non-DT aware firmware.

Some time later, FDT infrastructure was generalized to be usable by
all architectures.  At the time of this writing, 6 mainlined
architectures (arm, microblaze, mips, powerpc, sparc, and x86) and 1
out of mainline (nios) have some level of DT support.

2. Data Model
-------------
If you haven't already read the Device Tree Usage[1] page,
then go read it now.  It's okay, I'll wait....

2.1 High Level View
-------------------
The most important thing to understand is that the DT is simply a data
structure that describes the hardware.  There is nothing magical about
it, and it doesn't magically make all hardware configuration problems
go away.  What it does do is provide a language for decoupling the
hardware configuration from the board and device driver support in the
Linux kernel (or any other operating system for that matter).  Using
it allows board and device support to become data driven; to make
setup decisions based on data passed into the kernel instead of on
per-machine hard coded selections.

Ideally, data driven platform setup should result in less code
duplication and make it easier to support a wide range of hardware
with a single kernel image.

Linux uses DT data for three major purposes:
1) platform identification,
2) runtime configuration, and
3) device population.

2.2 Platform Identification
---------------------------
First and foremost, the kernel will use data in the DT to identify the
specific machine.  In a perfect world, the specific platform shouldn't
matter to the kernel because all platform details would be described
perfectly by the device tree in a consistent and reliable manner.
Hardware is not perfect though, and so the kernel must identify the
machine during early boot so that it has the opportunity to run
machine-specific fixups.

In the majority of cases, the machine identity is irrelevant, and the
kernel will instead select setup code based on the machine's core
CPU or SoC.  On ARM for example, setup_arch() in
arch/arm/kernel/setup.c will call setup_machine_fdt() in
arch/arm/kernel/devicetree.c which searches through the machine_desc
table and selects the machine_desc which best matches the device tree
data.  It determines the best match by looking at the 'compatible'
property in the root device tree node, and comparing it with the
dt_compat list in struct machine_desc.

The 'compatible' property contains a sorted list of strings starting
with the exact name of the machine, followed by an optional list of
boards it is compatible with sorted from most compatible to least.  For
example, the root compatible properties for the TI BeagleBoard and its
successor, the BeagleBoard xM board might look like:

	compatible = "ti,omap3-beagleboard", "ti,omap3450", "ti,omap3";
	compatible = "ti,omap3-beagleboard-xm", "ti,omap3450", "ti,omap3";

Here "ti,omap3-beagleboard-xm" specifies the exact model, and the
remaining entries claim that it is compatible with the OMAP 3450 SoC
and with the omap3 family of SoCs in general.  You'll notice that the
list is sorted from most specific (exact board) to least specific (SoC
family).

Astute readers might point out that the Beagle xM could also claim
compatibility with the original Beagle board.  However, one should be
cautioned about doing so at the board level since there is typically a
high level of change from one board to another, even within the same
product line, and it is hard to nail down exactly what is meant when one
board claims to be compatible with another.  For the top level, it is
better to err on the side of caution and not claim one board is
compatible with another.  The notable exception would be when one
board is a carrier for another, such as a CPU module attached to a
carrier board.

One more note on compatible values.  Any string used in a compatible
property must be documented as to what it indicates.  Add
documentation for compatible strings in Documentation/devicetree/bindings.

Again on ARM, for each machine_desc, the kernel looks to see if
any of the dt_compat list entries appear in the compatible property.
If one does, then that machine_desc is a candidate for driving the
machine.  After searching the entire table of machine_descs,
setup_machine_fdt() returns the 'most compatible' machine_desc based
on which entry in the compatible property each machine_desc matches
against.  If no matching machine_desc is found, then it returns NULL.

The reasoning behind this scheme is the observation that in the majority
of cases, a single machine_desc can support a large number of boards
if they all use the same SoC, or same family of SoCs.  However,
invariably there will be some exceptions where a specific board will
require special setup code that is not useful in the generic case.
Special cases could be handled by explicitly checking for the
troublesome board(s) in generic setup code, but doing so very quickly
becomes ugly and/or unmaintainable if it is more than just a couple of
cases.

Instead, the compatible list allows a generic machine_desc to provide
support for a wide common set of boards by specifying "less
compatible" values in the dt_compat list.  In the example above,
generic board support can claim compatibility with "ti,omap3" or
"ti,omap3450".  If a bug was discovered on the original beagleboard
that required special workaround code during early boot, then a new
machine_desc could be added which implements the workarounds and only
matches on "ti,omap3-beagleboard".
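
The selection rule can be modelled in a few lines of userspace C.  This
is only a sketch of the scoring idea described above, not the code in
arch/arm/kernel/devicetree.c.  The simplified struct layout, the helper
names machine_score() and select_machine(), and the two-entry sample
table are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct machine_desc (illustrative only). */
struct machine_desc {
	const char *name;
	const char *const *dt_compat;	/* NULL-terminated list */
};

/*
 * Return the index of the first entry in the root 'compatible' list that
 * also appears in dt_compat, or INT_MAX when nothing matches.  A lower
 * index means a more specific match.
 */
static int machine_score(const char *const *root_compat,
			 const char *const *dt_compat)
{
	int i;
	size_t j;

	assert(root_compat && dt_compat);
	for (i = 0; root_compat[i]; i++)
		for (j = 0; dt_compat[j]; j++)
			if (strcmp(root_compat[i], dt_compat[j]) == 0)
				return i;
	return INT_MAX;
}

/* Walk the whole table and keep the machine_desc with the lowest score. */
static const struct machine_desc *
select_machine(const struct machine_desc *table, size_t count,
	       const char *const *root_compat)
{
	const struct machine_desc *best = NULL;
	int best_score = INT_MAX;
	size_t i;

	for (i = 0; i < count; i++) {
		int score = machine_score(root_compat, table[i].dt_compat);

		if (score < best_score) {
			best = &table[i];
			best_score = score;
		}
	}
	return best;	/* NULL when no machine_desc matches */
}

/* Hypothetical table: one generic entry, one board-specific entry. */
static const char *const generic_compat[] = { "ti,omap3", NULL };
static const char *const beagle_compat[] = { "ti,omap3-beagleboard", NULL };

static const struct machine_desc machine_table[] = {
	{ "Generic OMAP3", generic_compat },
	{ "BeagleBoard workarounds", beagle_compat },
};

/* Root 'compatible' lists taken from the BeagleBoard example. */
static const char *const beagle_root[] = {
	"ti,omap3-beagleboard", "ti,omap3450", "ti,omap3", NULL
};
static const char *const beagle_xm_root[] = {
	"ti,omap3-beagleboard-xm", "ti,omap3450", "ti,omap3", NULL
};
```

With this table, select_machine() picks the workaround entry for
beagle_root because it matches the first (most specific) string, while
beagle_xm_root only matches the generic entry through "ti,omap3".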

PowerPC uses a slightly different scheme where it calls the .probe()
hook from each machine_desc, and the first one returning TRUE is used.
However, this approach does not take into account the priority of the
compatible list, and probably should be avoided for new architecture
support.
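
The order dependence of a first-match scheme is easy to see in a small
userspace model.  This is a sketch only and does not reproduce the
PowerPC code; struct machdep, probe_first(), and the "acme,*" compatible
strings are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified machine description with a .probe() hook. */
struct machdep {
	const char *name;
	bool (*probe)(const char *const *root_compat);
};

/* Does the root 'compatible' list contain the given string? */
static bool root_has(const char *const *root_compat, const char *s)
{
	size_t i;

	assert(root_compat && s);
	for (i = 0; root_compat[i]; i++)
		if (strcmp(root_compat[i], s) == 0)
			return true;
	return false;
}

static bool generic_probe(const char *const *rc)
{
	return root_has(rc, "acme,soc");
}

static bool board_probe(const char *const *rc)
{
	return root_has(rc, "acme,board");
}

/* The first entry whose probe hook returns true wins. */
static const struct machdep *probe_first(const struct machdep *table,
					 size_t count,
					 const char *const *root_compat)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (table[i].probe(root_compat))
			return &table[i];
	return NULL;
}

static const char *const board_root[] = { "acme,board", "acme,soc", NULL };

/* Same two entries, listed in different orders. */
static const struct machdep generic_first[] = {
	{ "generic", generic_probe },
	{ "board", board_probe },
};
static const struct machdep board_first[] = {
	{ "board", board_probe },
	{ "generic", generic_probe },
};
```

For the same board_root list, probe_first() returns "generic" from the
first table and "board" from the second: the result follows the table
order rather than the order of the compatible list.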

2.3 Runtime configuration
-------------------------
In most cases, a DT will be the sole method of communicating data from
firmware to the kernel, so it also gets used to pass in runtime and
configuration data like the kernel parameters string and the location
of an initrd image.

Most of this data is contained in the /chosen node, and when booting
Linux it will look something like this:

	chosen {
		bootargs = "console=ttyS0,115200 loglevel=8";
		initrd-start = <0xc8000000>;
		initrd-end = <0xc8200000>;
	};

The bootargs property contains the kernel arguments, and the initrd-*
properties define the start and end addresses of an initrd blob.  The
chosen node may also optionally contain an arbitrary number of
additional properties for platform-specific configuration data.

During early boot, the architecture setup code calls of_scan_flat_dt()
several times with different helper callbacks to parse device tree
data before paging is set up.  The of_scan_flat_dt() code scans through
the device tree and uses the helpers to extract information required
during early boot.  Typically the early_init_dt_scan_chosen() helper
is used to parse the chosen node including kernel parameters,
early_init_dt_scan_root() to initialize the DT address space model,
and early_init_dt_scan_memory() to determine the size and
location of usable RAM.
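
The scan-with-callback pattern can be sketched in userspace C as
follows.  This is a toy model, not the kernel implementation: struct
flat_node, scan_flat_nodes(), and the two callbacks are invented for the
example, and a real flattened tree is a binary blob rather than an array
of structs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a node in the flattened tree (illustrative only). */
struct flat_node {
	const char *uname;	/* unit name, e.g. "chosen" or "memory" */
	int depth;		/* 0 for the root node */
	const char *bootargs;	/* NULL unless the node carries bootargs */
	unsigned long mem_base;	/* only meaningful when mem_size != 0 */
	unsigned long mem_size;
};

typedef int (*scan_fn)(const struct flat_node *node, void *data);

/* Visit every node in order; stop as soon as a callback returns non-zero. */
static int scan_flat_nodes(const struct flat_node *nodes, size_t count,
			   scan_fn it, void *data)
{
	size_t i;
	int rc = 0;

	assert(nodes && it);
	for (i = 0; i < count && !rc; i++)
		rc = it(&nodes[i], data);
	return rc;
}

/* Callback: pick up the kernel command line from /chosen. */
static int scan_chosen(const struct flat_node *node, void *data)
{
	const char **cmdline = data;

	if (node->depth != 1 || strcmp(node->uname, "chosen") != 0)
		return 0;	/* not /chosen, keep scanning */
	*cmdline = node->bootargs;
	return 1;		/* found it, stop the scan */
}

/* Callback: add up the RAM described by memory nodes. */
static int scan_memory(const struct flat_node *node, void *data)
{
	unsigned long *total = data;

	if (node->depth == 1 && strcmp(node->uname, "memory") == 0)
		*total += node->mem_size;
	return 0;		/* there may be several memory nodes */
}

/* Sample data: root, /chosen and /memory. */
static const struct flat_node sample_tree[] = {
	{ "", 0, NULL, 0, 0 },
	{ "chosen", 1, "console=ttyS0,115200 loglevel=8", 0, 0 },
	{ "memory", 1, NULL, 0x00000000, 0x40000000 },
};
```

Each pass over the tree extracts one kind of information, which is why
the setup code runs the scan several times with different callbacks.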

On ARM, the function setup_machine_fdt() is responsible for early
scanning of the device tree after selecting the correct machine_desc
that supports the board.

2.4 Device population
---------------------
After the board has been identified, and after the early configuration data
has been parsed, then kernel initialization can proceed in the normal
way.  At some point in this process, unflatten_device_tree() is called
to convert the data into a more efficient runtime representation.
This is also when machine-specific setup hooks will get called, like
the machine_desc .init_early(), .init_irq() and .init_machine() hooks
on ARM.  The remainder of this section uses examples from the ARM
implementation, but all architectures will do pretty much the same
thing when using a DT.

As can be guessed by the names, .init_early() is used for any machine-
specific setup that needs to be executed early in the boot process,
and .init_irq() is used to set up interrupt handling.  Using a DT
doesn't materially change the behaviour of either of these functions.
If a DT is provided, then both .init_early() and .init_irq() are able
to call any of the DT query functions (of_* in include/linux/of*.h) to
get additional data about the platform.

The most interesting hook in the DT context is .init_machine() which
is primarily responsible for populating the Linux device model with
data about the platform.  Historically this has been implemented on
embedded platforms by defining a set of static clock structures,
platform_devices, and other data in the board support .c file, and
registering it en-masse in .init_machine().  When DT is used, then
instead of hard coding static devices for each platform, the list of
devices can be obtained by parsing the DT, and allocating device
structures dynamically.

The simplest case is when .init_machine() is only responsible for
registering a block of platform_devices.  A platform_device is a concept
used by Linux for memory or I/O mapped devices which cannot be detected
by hardware, and for 'composite' or 'virtual' devices (more on those
later).  While there is no 'platform device' terminology for the DT,
platform devices roughly correspond to device nodes at the root of the
tree and children of simple memory mapped bus nodes.

About now is a good time to lay out an example.  Here is part of the
device tree for the NVIDIA Tegra board.

/{
	compatible = "nvidia,harmony", "nvidia,tegra20";
	#address-cells = <1>;
	#size-cells = <1>;
	interrupt-parent = <&intc>;

	chosen { };
	aliases { };

	memory {
		device_type = "memory";
		reg = <0x00000000 0x40000000>;
	};

	soc {
		compatible = "nvidia,tegra20-soc", "simple-bus";
		#address-cells = <1>;
		#size-cells = <1>;
		ranges;

		intc: interrupt-controller@50041000 {
			compatible = "nvidia,tegra20-gic";
			interrupt-controller;
			#interrupt-cells = <1>;
			reg = <0x50041000 0x1000>, < 0x50040100 0x0100 >;
		};

		serial@70006300 {
			compatible = "nvidia,tegra20-uart";
			reg = <0x70006300 0x100>;
			interrupts = <122>;
		};

		i2s1: i2s@70002800 {
			compatible = "nvidia,tegra20-i2s";
			reg = <0x70002800 0x100>;
			interrupts = <77>;
			codec = <&wm8903>;
		};

		i2c@7000c000 {
			compatible = "nvidia,tegra20-i2c";
			#address-cells = <1>;
			#size-cells = <0>;
			reg = <0x7000c000 0x100>;
			interrupts = <70>;

			wm8903: codec@1a {
				compatible = "wlf,wm8903";
				reg = <0x1a>;
				interrupts = <347>;
			};
		};
	};

	sound {
		compatible = "nvidia,harmony-sound";
		i2s-controller = <&i2s1>;
		i2s-codec = <&wm8903>;
	};
};

At .init_machine() time, Tegra board support code will need to look at
this DT and decide which nodes to create platform_devices for.
However, looking at the tree, it is not immediately obvious what kind
of device each node represents, or even if a node represents a device
at all.  The /chosen, /aliases, and /memory nodes are informational
nodes that don't describe devices (although arguably memory could be
considered a device).  The children of the /soc node are memory mapped
devices, but the codec@1a is an i2c device, and the sound node
represents not a device, but rather how other devices are connected
together to create the audio subsystem.  I know what each device is
because I'm familiar with the board design, but how does the kernel
know what to do with each node?

The trick is that the kernel starts at the root of the tree and looks
for nodes that have a 'compatible' property.  First, it is generally
assumed that any node with a 'compatible' property represents a device
of some kind, and second, it can be assumed that any node at the root
of the tree is either directly attached to the processor bus, or is a
miscellaneous system device that cannot be described any other way.
For each of these nodes, Linux allocates and registers a
platform_device, which in turn may get bound to a platform_driver.

Why is using a platform_device for these nodes a safe assumption?
Well, for the way that Linux models devices, just about all bus_types
assume that their devices are children of a bus controller.  For
example, each i2c_client is a child of an i2c_master.  Each spi_device
is a child of an SPI bus.  Similarly for USB, PCI, MDIO, etc.  The
same hierarchy is also found in the DT, where I2C device nodes only
ever appear as children of an I2C bus node.  Ditto for SPI, MDIO, USB,
etc.  The only devices which do not require a specific type of parent
device are platform_devices (and amba_devices, but more on that
later), which will happily live at the base of the Linux /sys/devices
tree.  Therefore, if a DT node is at the root of the tree, then it
is probably best registered as a platform_device.

Linux board support code calls of_platform_populate(NULL, NULL, NULL, NULL)
to kick off discovery of devices at the root of the tree.  The
parameters are all NULL because when starting from the root of the
tree, there is no need to provide a starting node (the first NULL), a
parent struct device (the last NULL), and we're not using a match
table (yet).  For a board that only needs to register devices,
.init_machine() can be completely empty except for the
of_platform_populate() call.

In the Tegra example, this accounts for the /soc and /sound nodes, but
what about the children of the SoC node?  Shouldn't they be registered
as platform devices too?  For Linux DT support, the generic behaviour
is for child devices to be registered by the parent's device driver at
driver .probe() time.  So, an i2c bus device driver will register an
i2c_client for each child node, an SPI bus driver will register
its spi_device children, and similarly for other bus_types.
According to that model, a driver could be written that binds to the
SoC node and simply registers platform_devices for each of its
children.  The board support code would allocate and register an SoC
device, a (theoretical) SoC device driver could bind to the SoC device,
and register platform_devices for /soc/interrupt-controller, /soc/serial,
/soc/i2s, and /soc/i2c in its .probe() hook.  Easy, right?

Actually, it turns out that registering children of some
platform_devices as more platform_devices is a common pattern, and the
device tree support code reflects that and makes the above example
simpler.  The second argument to of_platform_populate() is an
of_device_id table, and any node that matches an entry in that table
will also get its child nodes registered.  In the Tegra case, the code
can look something like this:

static void __init harmony_init_machine(void)
{
	/* ... */
	of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
}

"simple-bus" is defined in the ePAPR 1.0 specification as a compatible
value meaning a simple memory mapped bus, so the of_platform_populate() code
could be written to just assume simple-bus compatible nodes will
always be traversed.  However, we pass it in as an argument so that
board support code can always override the default behaviour.
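
The traversal rule can be modelled with a short userspace sketch.  It is
not the of_platform_populate() implementation: struct dt_node, struct
dev_list, populate(), and the trimmed-down copy of the Tegra tree are
assumptions made for illustration, and "registering" a device is reduced
to recording the node name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEVS 16

/* Toy device tree node (illustrative only). */
struct dt_node {
	const char *name;
	const char *const *compatible;	/* NULL-terminated, NULL if absent */
	const struct dt_node *children;
	size_t nr_children;
};

/* Names of the "platform devices" created so far. */
struct dev_list {
	const char *names[MAX_DEVS];
	size_t count;
};

/* Does any 'compatible' string of np appear in the match list? */
static int node_matches(const struct dt_node *np, const char *const *matches)
{
	size_t i, j;

	if (!np->compatible || !matches)
		return 0;
	for (i = 0; np->compatible[i]; i++)
		for (j = 0; matches[j]; j++)
			if (strcmp(np->compatible[i], matches[j]) == 0)
				return 1;
	return 0;
}

/*
 * Create a device for every child of parent that has a 'compatible'
 * property, and descend only into children that match the bus table.
 */
static void populate(const struct dt_node *parent,
		     const char *const *bus_matches, struct dev_list *out)
{
	size_t i;

	for (i = 0; i < parent->nr_children; i++) {
		const struct dt_node *np = &parent->children[i];

		if (!np->compatible)
			continue;	/* informational node, not a device */
		assert(out->count < MAX_DEVS);
		out->names[out->count++] = np->name;
		if (node_matches(np, bus_matches))
			populate(np, bus_matches, out);
	}
}

/* Trimmed-down copy of the Tegra example tree. */
static const char *const c_soc[] = { "nvidia,tegra20-soc", "simple-bus", NULL };
static const char *const c_uart[] = { "nvidia,tegra20-uart", NULL };
static const char *const c_i2c[] = { "nvidia,tegra20-i2c", NULL };
static const char *const c_codec[] = { "wlf,wm8903", NULL };
static const char *const c_sound[] = { "nvidia,harmony-sound", NULL };

static const struct dt_node i2c_children[] = {
	{ "codec@1a", c_codec, NULL, 0 },
};
static const struct dt_node soc_children[] = {
	{ "serial@70006300", c_uart, NULL, 0 },
	{ "i2c@7000c000", c_i2c, i2c_children, 1 },
};
static const struct dt_node root_children[] = {
	{ "chosen", NULL, NULL, 0 },
	{ "memory", NULL, NULL, 0 },
	{ "soc", c_soc, soc_children, 2 },
	{ "sound", c_sound, NULL, 0 },
};
static const struct dt_node root = { "/", NULL, root_children, 4 };

static const char *const default_buses[] = { "simple-bus", NULL };
```

Calling populate(&root, default_buses, &devs) records soc, its two
children, and sound.  The codec@1a node is skipped because its parent is
an i2c controller rather than a simple-bus, which leaves it for the i2c
bus driver to register.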

[Need to add discussion of adding i2c/spi/etc child devices]

Appendix A: AMBA devices
------------------------

ARM Primecells are a certain kind of device attached to the ARM AMBA
bus which include some support for hardware detection and power
management.  In Linux, struct amba_device and the amba_bus_type are
used to represent Primecell devices.  However, the fiddly bit is that
not all devices on an AMBA bus are Primecells, and for Linux it is
typical for both amba_device and platform_device instances to be
siblings of the same bus segment.

When using the DT, this creates problems for of_platform_populate()
because it must decide whether to register each node as either a
platform_device or an amba_device.  This unfortunately complicates the
device creation model a little bit, but the solution turns out not to
be too invasive.  If a node is compatible with "arm,amba-primecell", then
of_platform_populate() will register it as an amba_device instead of a
platform_device.
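
The decision amounts to one extra check on the node's compatible list.
The sketch below is a userspace illustration only: classify_node() and
enum dev_kind are hypothetical names, and the "arm,pl011" string is used
here simply as an example of a Primecell device.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum dev_kind { DEV_PLATFORM, DEV_AMBA };

/* Pick the device type from a node's NULL-terminated compatible list. */
static enum dev_kind classify_node(const char *const *compatible)
{
	size_t i;

	assert(compatible);
	for (i = 0; compatible[i]; i++)
		if (strcmp(compatible[i], "arm,amba-primecell") == 0)
			return DEV_AMBA;
	return DEV_PLATFORM;
}

/* Example compatible lists (the first one is an assumed Primecell node). */
static const char *const primecell_uart[] = {
	"arm,pl011", "arm,amba-primecell", NULL
};
static const char *const tegra_uart[] = { "nvidia,tegra20-uart", NULL };
```

Here classify_node(primecell_uart) yields DEV_AMBA and
classify_node(tegra_uart) yields DEV_PLATFORM, so both kinds of device
can be siblings under the same bus node.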
Commit	Line	Data
31134efc GL	1	Linux and the Device Tree
	2	-------------------------
	3	The Linux usage model for device tree data
	4
	5	Author: Grant Likely <grant.likely@secretlab.ca>
	6
	7	This article describes how Linux uses the device tree. An overview of
	8	the device tree data format can be found on the device tree usage page
	9	at devicetree.org[1].
	10
	11	[1] http://devicetree.org/Device_Tree_Usage
	12
	13	The "Open Firmware Device Tree", or simply Device Tree (DT), is a data
	14	structure and language for describing hardware. More specifically, it
	15	is a description of hardware that is readable by an operating system
	16	so that the operating system doesn't need to hard code details of the
	17	machine.
	18
	19	Structurally, the DT is a tree, or acyclic graph with named nodes, and
	20	nodes may have an arbitrary number of named properties encapsulating
	21	arbitrary data. A mechanism also exists to create arbitrary
	22	links from one node to another outside of the natural tree structure.
	23
	24	Conceptually, a common set of usage conventions, called 'bindings',
	25	is defined for how data should appear in the tree to describe typical
	26	hardware characteristics including data busses, interrupt lines, GPIO
	27	connections, and peripheral devices.
	28
	29	As much as possible, hardware is described using existing bindings to
	30	maximize use of existing support code, but since property and node
	31	names are simply text strings, it is easy to extend existing bindings
	32	or create new ones by defining new nodes and properties. Be wary,
	33	however, of creating a new binding without first doing some homework
	34	about what already exists. There are currently two different,
	35	incompatible, bindings for i2c busses that came about because the new
	36	binding was created without first investigating how i2c devices were
	37	already being enumerated in existing systems.
	38
	39	1. History
	40	----------
	41	The DT was originally created by Open Firmware as part of the
	42	communication method for passing data from Open Firmware to a client
	43	program (like to an operating system). An operating system used the
	44	Device Tree to discover the topology of the hardware at runtime, and
	45	thereby support a majority of available hardware without hard coded
	46	information (assuming drivers were available for all devices).
	47
	48	Since Open Firmware is commonly used on PowerPC and SPARC platforms,
	49	the Linux support for those architectures has for a long time used the
	50	Device Tree.
	51
	52	In 2005, when PowerPC Linux began a major cleanup and to merge 32-bit
	53	and 64-bit support, the decision was made to require DT support on all
	54	powerpc platforms, regardless of whether or not they used Open
	55	Firmware. To do this, a DT representation called the Flattened Device
	56	Tree (FDT) was created which could be passed to the kernel as a binary
	57	blob without requiring a real Open Firmware implementation. U-Boot,
	58	kexec, and other bootloaders were modified to support both passing a
	59	Device Tree Binary (dtb) and to modify a dtb at boot time. DT was
	60	also added to the PowerPC boot wrapper (arch/powerpc/boot/*) so that
	61	a dtb could be wrapped up with the kernel image to support booting
	62	existing non-DT aware firmware.
	63
	64	Some time later, FDT infrastructure was generalized to be usable by
65	all architectures. At the time of this writing, 6 mainlined
66	architectures (arm, microblaze, mips, powerpc, sparc, and x86) and 1
67	out of mainline (nios) have some level of DT support.
68
69	2. Data Model
70	-------------
71	If you haven't already read the Device Tree Usage[1] page,
72	then go read it now. It's okay, I'll wait....
73
74	2.1 High Level View
75	-------------------
76	The most important thing to understand is that the DT is simply a data
77	structure that describes the hardware. There is nothing magical about
78	it, and it doesn't magically make all hardware configuration problems
79	go away. What it does do is provide a language for decoupling the
80	hardware configuration from the board and device driver support in the
81	Linux kernel (or any other operating system for that matter). Using
82	it allows board and device support to become data driven; to make
83	setup decisions based on data passed into the kernel instead of on
84	per-machine hard coded selections.
85
86	Ideally, data driven platform setup should result in less code
87	duplication and make it easier to support a wide range of hardware
88	with a single kernel image.
89
90	Linux uses DT data for three major purposes:
91	1) platform identification,
92	2) runtime configuration, and
93	3) device population.
94
95	2.2 Platform Identification
96	---------------------------
97	First and foremost, the kernel will use data in the DT to identify the
98	specific machine. In a perfect world, the specific platform shouldn't
99	matter to the kernel because all platform details would be described
100	perfectly by the device tree in a consistent and reliable manner.
101	Hardware is not perfect though, and so the kernel must identify the
102	machine during early boot so that it has the opportunity to run
103	machine-specific fixups.
104
105	In the majority of cases, the machine identity is irrelevant, and the
106	kernel will instead select setup code based on the machine's core
107	CPU or SoC. On ARM for example, setup_arch() in
108	arch/arm/kernel/setup.c will call setup_machine_fdt() in
109	arch/arm/kernel/devicetree.c which searches through the machine_desc
110	table and selects the machine_desc which best matches the device tree
111	data. It determines the best match by looking at the 'compatible'
112	property in the root device tree node, and comparing it with the
113	dt_compat list in struct machine_desc.
114
115	The 'compatible' property contains a sorted list of strings starting
116	with the exact name of the machine, followed by an optional list of
117	boards it is compatible with sorted from most compatible to least. For
118	example, the root compatible properties for the TI BeagleBoard and its
119	successor, the BeagleBoard xM board might look like:
120
121	compatible = "ti,omap3-beagleboard", "ti,omap3450", "ti,omap3";
122	compatible = "ti,omap3-beagleboard-xm", "ti,omap3450", "ti,omap3";
123
124	Where "ti,omap3-beagleboard-xm" specifies the exact model, it also
125	claims that it compatible with the OMAP 3450 SoC, and the omap3 family
126	of SoCs in general. You'll notice that the list is sorted from most
127	specific (exact board) to least specific (SoC family).
128
129	Astute readers might point out that the Beagle xM could also claim
130	compatibility with the original Beagle board. However, one should be
131	cautioned about doing so at the board level since there is typically a
132	high level of change from one board to another, even within the same
133	product line, and it is hard to nail down exactly what is meant when one
134	board claims to be compatible with another. For the top level, it is
135	better to err on the side of caution and not claim one board is
136	compatible with another. The notable exception would be when one
137	board is a carrier for another, such as a CPU module attached to a
138	carrier board.
139
140	One more note on compatible values. Any string used in a compatible
141	property must be documented as to what it indicates. Add
142	documentation for compatible strings in Documentation/devicetree/bindings.
143
144	Again on ARM, for each machine_desc, the kernel looks to see if
145	any of the dt_compat list entries appear in the compatible property.
146	If one does, then that machine_desc is a candidate for driving the
147	machine. After searching the entire table of machine_descs,
148	setup_machine_fdt() returns the 'most compatible' machine_desc based
149	on which entry in the compatible property each machine_desc matches
150	against. If no matching machine_desc is found, then it returns NULL.
151
152	The reasoning behind this scheme is the observation that in the majority
153	of cases, a single machine_desc can support a large number of boards
154	if they all use the same SoC, or same family of SoCs. However,
155	invariably there will be some exceptions where a specific board will
156	require special setup code that is not useful in the generic case.
157	Special cases could be handled by explicitly checking for the
158	troublesome board(s) in generic setup code, but doing so very quickly
159	becomes ugly and/or unmaintainable if it is more than just a couple of
160	cases.
161
162	Instead, the compatible list allows a generic machine_desc to provide
163	support for a wide common set of boards by specifying "less
164	compatible" value in the dt_compat list. In the example above,
165	generic board support can claim compatibility with "ti,omap3" or
166	"ti,omap3450". If a bug was discovered on the original beagleboard
167	that required special workaround code during early boot, then a new
168	machine_desc could be added which implements the workarounds and only
169	matches on "ti,omap3-beagleboard".
170
171	PowerPC uses a slightly different scheme where it calls the .probe()
172	hook from each machine_desc, and the first one returning TRUE is used.
173	However, this approach does not take into account the priority of the
174	compatible list, and probably should be avoided for new architecture
175	support.
176
177	2.3 Runtime configuration
178	-------------------------
179	In most cases, a DT will be the sole method of communicating data from
180	firmware to the kernel, so also gets used to pass in runtime and
181	configuration data like the kernel parameters string and the location
182	of an initrd image.
183
184	Most of this data is contained in the /chosen node, and when booting
185	Linux it will look something like this:
186
187	chosen {
188	bootargs = "console=ttyS0,115200 loglevel=8";
189	initrd-start = <0xc8000000>;
190	initrd-end = <0xc8200000>;
191	};
192
193	The bootargs property contains the kernel arguments, and the initrd-*
194	properties define the address and size of an initrd blob. The
195	chosen node may also optionally contain an arbitrary number of
196	additional properties for platform-specific configuration data.
197
198	During early boot, the architecture setup code calls of_scan_flat_dt()
199	several times with different helper callbacks to parse device tree
200	data before paging is setup. The of_scan_flat_dt() code scans through
201	the device tree and uses the helpers to extract information required
202	during early boot. Typically the early_init_dt_scan_chosen() helper
203	is used to parse the chosen node including kernel parameters,
204	early_init_dt_scan_root() to initialize the DT address space model,
205	and early_init_dt_scan_memory() to determine the size and
206	location of usable RAM.
207
208	On ARM, the function setup_machine_fdt() is responsible for early
209	scanning of the device tree after selecting the correct machine_desc
210	that supports the board.
211
212	2.4 Device population
213	---------------------
214	After the board has been identified, and after the early configuration data
215	has been parsed, then kernel initialization can proceed in the normal
216	way. At some point in this process, unflatten_device_tree() is called
217	to convert the data into a more efficient runtime representation.
218	This is also when machine-specific setup hooks will get called, like
219	the machine_desc .init_early(), .init_irq() and .init_machine() hooks
220	on ARM. The remainder of this section uses examples from the ARM
221	implementation, but all architectures will do pretty much the same
222	thing when using a DT.
223
224	As can be guessed by the names, .init_early() is used for any machine-
225	specific setup that needs to be executed early in the boot process,
226	and .init_irq() is used to set up interrupt handling. Using a DT
227	doesn't materially change the behaviour of either of these functions.
If a DT is provided, then both .init_early() and .init_irq() are able
to call any of the DT query functions (of_* in include/linux/of*.h) to
get additional data about the platform.

The most interesting hook in the DT context is .init_machine(), which
is primarily responsible for populating the Linux device model with
data about the platform. Historically this has been implemented on
embedded platforms by defining a set of static clock structures,
platform_devices, and other data in the board support .c file, and
registering it en masse in .init_machine(). When DT is used, then
instead of hard coding static devices for each platform, the list of
devices can be obtained by parsing the DT, and allocating device
structures dynamically.

The simplest case is when .init_machine() is only responsible for
registering a block of platform_devices. A platform_device is a concept
used by Linux for memory or I/O mapped devices which cannot be detected
by hardware, and for 'composite' or 'virtual' devices (more on those
later). While there is no 'platform device' terminology for the DT,
platform devices roughly correspond to device nodes at the root of the
tree and children of simple memory mapped bus nodes.

About now is a good time to lay out an example. Here is part of the
device tree for the NVIDIA Tegra board.

	/{
		compatible = "nvidia,harmony", "nvidia,tegra20";
		#address-cells = <1>;
		#size-cells = <1>;
		interrupt-parent = <&intc>;

		chosen { };
		aliases { };

		memory {
			device_type = "memory";
			reg = <0x00000000 0x40000000>;
		};

		soc {
			compatible = "nvidia,tegra20-soc", "simple-bus";
			#address-cells = <1>;
			#size-cells = <1>;
			ranges;

			intc: interrupt-controller@50041000 {
				compatible = "nvidia,tegra20-gic";
				interrupt-controller;
				#interrupt-cells = <1>;
				reg = <0x50041000 0x1000>, < 0x50040100 0x0100 >;
			};

			serial@70006300 {
				compatible = "nvidia,tegra20-uart";
				reg = <0x70006300 0x100>;
				interrupts = <122>;
			};

			i2s1: i2s@70002800 {
				compatible = "nvidia,tegra20-i2s";
				reg = <0x70002800 0x100>;
				interrupts = <77>;
				codec = <&wm8903>;
			};

			i2c@7000c000 {
				compatible = "nvidia,tegra20-i2c";
				#address-cells = <1>;
				#size-cells = <0>;
				reg = <0x7000c000 0x100>;
				interrupts = <70>;

				wm8903: codec@1a {
					compatible = "wlf,wm8903";
					reg = <0x1a>;
					interrupts = <347>;
				};
			};
		};

		sound {
			compatible = "nvidia,harmony-sound";
			i2s-controller = <&i2s1>;
			i2s-codec = <&wm8903>;
		};
	};

At .init_machine() time, Tegra board support code will need to look at
this DT and decide which nodes to create platform_devices for.
However, looking at the tree, it is not immediately obvious what kind
of device each node represents, or even if a node represents a device
at all. The /chosen, /aliases, and /memory nodes are informational
nodes that don't describe devices (although arguably memory could be
considered a device). The children of the /soc node are memory mapped
devices, but the codec@1a is an i2c device, and the sound node
represents not a device, but rather how other devices are connected
together to create the audio subsystem. I know what each device is
because I'm familiar with the board design, but how does the kernel
know what to do with each node?

The trick is that the kernel starts at the root of the tree and looks
for nodes that have a 'compatible' property. First, it is generally
assumed that any node with a 'compatible' property represents a device
of some kind, and second, it can be assumed that any node at the root
of the tree is either directly attached to the processor bus, or is a
miscellaneous system device that cannot be described any other way.
For each of these nodes, Linux allocates and registers a
platform_device, which in turn may get bound to a platform_driver.

Why is using a platform_device for these nodes a safe assumption?
Well, for the way that Linux models devices, just about all bus_types
assume that their devices are children of a bus controller. For
example, each i2c_client is a child of an i2c_master. Each spi_device
is a child of an SPI bus. Similarly for USB, PCI, MDIO, etc. The
same hierarchy is also found in the DT, where I2C device nodes only
ever appear as children of an I2C bus node. Ditto for SPI, MDIO, USB,
etc. The only devices which do not require a specific type of parent
device are platform_devices (and amba_devices, but more on that
later), which will happily live at the base of the Linux /sys/devices
tree. Therefore, if a DT node is at the root of the tree, then it
is probably best registered as a platform_device.

Linux board support code calls of_platform_populate(NULL, NULL, NULL, NULL)
to kick off discovery of devices at the root of the tree. The
parameters are all NULL because when starting from the root of the
tree, there is no need to provide a starting node (the first NULL) or
a parent struct device (the last NULL), and we're not using a match
table (yet). For a board that only needs to register devices,
.init_machine() can be completely empty except for the
of_platform_populate() call.

In the Tegra example, this accounts for the /soc and /sound nodes, but
what about the children of the SoC node? Shouldn't they be registered
as platform devices too? For Linux DT support, the generic behaviour
is for child devices to be registered by the parent's device driver at
driver .probe() time. So, an i2c bus device driver will register an
i2c_client for each child node, an SPI bus driver will register
its spi_device children, and similarly for other bus_types.
According to that model, a driver could be written that binds to the
SoC node and simply registers platform_devices for each of its
children. The board support code would allocate and register an SoC
device, a (theoretical) SoC device driver could bind to the SoC device,
and register platform_devices for /soc/interrupt-controller, /soc/serial,
/soc/i2s, and /soc/i2c in its .probe() hook. Easy, right?

Actually, it turns out that registering children of some
platform_devices as more platform_devices is a common pattern, and the
device tree support code reflects that and makes the above example
simpler. The second argument to of_platform_populate() is an
of_device_id table, and any node that matches an entry in that table
will also get its child nodes registered. In the Tegra case, the code
can look something like this:

	static void __init harmony_init_machine(void)
	{
		/* ... */
		of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
	}

"simple-bus" is defined in the ePAPR 1.0 specification as a compatible
value meaning a simple memory mapped bus, so the of_platform_populate()
code could be written to just assume simple-bus compatible nodes will
always be traversed. However, we pass it in as an argument so that
board support code can always override the default behaviour.
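
The effect of the match table can be illustrated with a small userspace
model of the Tegra tree above. This is only a sketch of the traversal
rule described in this section, not kernel code: struct toy_node,
toy_match(), toy_populate() and toy_bus_table are hypothetical names
invented for the illustration, and the model counts the devices that
would be created instead of registering anything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a DT node; NOT the kernel's struct device_node. */
struct toy_node {
	const char *name;
	const char *compatible[3];		/* NULL-terminated, empty if absent */
	const struct toy_node *children[6];	/* NULL-terminated */
};

/* Leaf nodes of the example tree. */
static const struct toy_node toy_chosen  = { .name = "chosen" };
static const struct toy_node toy_aliases = { .name = "aliases" };
static const struct toy_node toy_memory  = { .name = "memory" };
static const struct toy_node toy_codec   = { .name = "codec@1a",
	.compatible = { "wlf,wm8903" } };
static const struct toy_node toy_intc    = { .name = "interrupt-controller@50041000",
	.compatible = { "nvidia,tegra20-gic" } };
static const struct toy_node toy_serial  = { .name = "serial@70006300",
	.compatible = { "nvidia,tegra20-uart" } };
static const struct toy_node toy_i2s     = { .name = "i2s@70002800",
	.compatible = { "nvidia,tegra20-i2s" } };
static const struct toy_node toy_sound   = { .name = "sound",
	.compatible = { "nvidia,harmony-sound" } };

/* Bus nodes and the root. */
static const struct toy_node toy_i2c = { .name = "i2c@7000c000",
	.compatible = { "nvidia,tegra20-i2c" },
	.children = { &toy_codec } };
static const struct toy_node toy_soc = { .name = "soc",
	.compatible = { "nvidia,tegra20-soc", "simple-bus" },
	.children = { &toy_intc, &toy_serial, &toy_i2s, &toy_i2c } };
static const struct toy_node toy_root = { .name = "/",
	.compatible = { "nvidia,harmony", "nvidia,tegra20" },
	.children = { &toy_chosen, &toy_aliases, &toy_memory,
		      &toy_soc, &toy_sound } };

/* Hypothetical match table containing only "simple-bus". */
static const char *const toy_bus_table[] = { "simple-bus", NULL };

/* Return 1 if any compatible string of 'np' is listed in 'table'. */
static int toy_match(const struct toy_node *np, const char *const *table)
{
	if (!table)
		return 0;
	for (size_t i = 0; np->compatible[i]; i++)
		for (size_t j = 0; table[j]; j++)
			if (strcmp(np->compatible[i], table[j]) == 0)
				return 1;
	return 0;
}

/* Count the devices that would be created for the children of 'parent'. */
static int toy_populate(const struct toy_node *parent,
			const char *const *bus_table)
{
	int created = 0;

	assert(parent != NULL);
	for (size_t i = 0; parent->children[i]; i++) {
		const struct toy_node *child = parent->children[i];

		if (!child->compatible[0])
			continue;	/* no 'compatible': not treated as a device */
		created++;		/* would become a platform_device */
		if (toy_match(child, bus_table))
			created += toy_populate(child, bus_table);
	}
	return created;
}
```

In this model, toy_populate(&toy_root, NULL) counts only /soc and
/sound, while toy_populate(&toy_root, toy_bus_table) also counts the
four children of /soc. The codec@1a node is never counted, because its
parent i2c node does not match the table and is left to the i2c bus
driver.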

[Need to add discussion of adding i2c/spi/etc child devices]

Appendix A: AMBA devices
------------------------

ARM Primecells are a certain kind of device attached to the ARM AMBA
bus which include some support for hardware detection and power
management. In Linux, struct amba_device and the amba_bus_type are
used to represent Primecell devices. However, the fiddly bit is that
not all devices on an AMBA bus are Primecells, and for Linux it is
typical for both amba_device and platform_device instances to be
siblings of the same bus segment.

When using the DT, this creates problems for of_platform_populate()
because it must decide whether to register each node as either a
platform_device or an amba_device. This unfortunately complicates the
device creation model a little bit, but the solution turns out not to
be too invasive. If a node is compatible with "arm,primecell", then
of_platform_populate() will register it as an amba_device instead of a
platform_device.
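
The decision can be sketched as a plain string comparison over a node's
compatible list. This is a userspace illustration only: the enum, the
toy_* helpers and the two sample compatible lists are hypothetical, and
the marker value "arm,primecell" is an assumption based on the check in
drivers/of/platform.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed marker value; see the lead-in above. */
#define TOY_PRIMECELL_COMPAT "arm,primecell"

enum toy_dev_type { TOY_PLATFORM_DEVICE, TOY_AMBA_DEVICE };

/* Return 1 if 'compat' appears in the NULL-terminated 'compatible' list. */
static int toy_is_compatible(const char *const *compatible, const char *compat)
{
	assert(compatible != NULL && compat != NULL);
	for (size_t i = 0; compatible[i]; i++)
		if (strcmp(compatible[i], compat) == 0)
			return 1;
	return 0;
}

/* Pick the device type that would be registered for a node. */
static enum toy_dev_type toy_device_type(const char *const *compatible)
{
	return toy_is_compatible(compatible, TOY_PRIMECELL_COMPAT)
		? TOY_AMBA_DEVICE : TOY_PLATFORM_DEVICE;
}

/* Sample lists: a Primecell UART and the Tegra UART from section 2.4. */
static const char *const toy_pl011_compat[] = {
	"arm,pl011", TOY_PRIMECELL_COMPAT, NULL
};
static const char *const toy_tegra_uart_compat[] = {
	"nvidia,tegra20-uart", NULL
};
```

With these lists, toy_device_type(toy_pl011_compat) selects the
amba_device path and toy_device_type(toy_tegra_uart_compat) selects the
platform_device path, so both kinds of device can sit side by side
under the same bus node.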