1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
|
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<chapter>
<title>File Download Support</title>
<para>
BitBake's fetch module is a standalone piece of library code
that deals with the intricacies of downloading source code
and files from remote systems.
Fetching source code is one of the corner stones of building software.
As such, this module forms an important part of BitBake.
</para>
<para>
The current fetch module is called "fetch2" and refers to the
fact that it is the second major version of the API.
The original version is obsolete and removed from the codebase.
Thus, in all cases, "fetch" refers to "fetch2" in this
manual.
</para>
<section id='the-download-fetch'>
<title>The Download (Fetch)</title>
<para>
BitBake takes several steps when fetching source code or files.
The fetcher codebase deals with two distinct processes in order:
obtaining the files from somewhere (cached or otherwise)
and then unpacking those files into a specific location and
perhaps in a specific way.
Getting and unpacking the files is often optionally followed
by patching.
Patching, however, is not covered by this module.
</para>
<para>
The code to execute the first part of this process, a fetch,
looks something like the following:
<literallayout class='monospaced'>
src_uri = (d.getVar('SRC_URI', True) or "").split()
fetcher = bb.fetch2.Fetch(src_uri, d)
fetcher.download()
</literallayout>
This code sets up an instance of the fetch class.
The instance uses a space-separated list of URLs from the
<link linkend='var-SRC_URI'><filename>SRC_URI</filename></link>
variable and then calls the <filename>download</filename>
method to download the files.
</para>
<para>
The instantiation of the fetch class is usually followed by:
<literallayout class='monospaced'>
rootdir = l.getVar('WORKDIR', True)
fetcher.unpack(rootdir)
</literallayout>
This code unpacks the downloaded files to the
specified by <filename>WORKDIR</filename>.
<note>
For convenience, the naming in these examples matches
the variables used by OpenEmbedded.
</note>
The <filename>SRC_URI</filename> and <filename>WORKDIR</filename>
variables are not coded into the fetcher.
They variables can (and are) called with different variable names.
In OpenEmbedded for example, the shared state (sstate) code uses
the fetch module to fetch the sstate files.
</para>
<para>
When the <filename>download()</filename> method is called,
BitBake tries to fulfill the URLs by looking for source files
in a specific search order:
<itemizedlist>
<listitem><para><emphasis>Pre-mirror Sites:</emphasis>
BitBake first uses pre-mirrors to try and find source files.
These locations are defined using the
<link linkend='var-PREMIRRORS'><filename>PREMIRRORS</filename></link>
variable.
</para></listitem>
<listitem><para><emphasis>Source URI:</emphasis>
If pre-mirrors fail, BitBake uses the original URL (e.g from
<filename>SRC_URI</filename>).
</para></listitem>
<listitem><para><emphasis>Mirror Sites:</emphasis>
If fetch failures occur, BitBake next uses mirror location as
defined by the
<link linkend='var-MIRRORS'><filename>MIRRORS</filename></link>
variable.
</para></listitem>
</itemizedlist>
</para>
<para>
For each URL passed to the fetcher, the fetcher
calls the submodule that handles that particular URL type.
This behavior can be the source of some confusion when you
are providing URLs for the <filename>SRC_URI</filename>
variable.
Consider the following two URLs:
<literallayout class='monospaced'>
http://git.yoctoproject.org/git/poky;protocol=git
git://git.yoctoproject.org/git/poky;protocol=http
</literallayout>
In the former case, the URL is passed to the
<filename>wget</filename> fetcher, which does not
understand "git".
Therefore, the latter case is the correct form since the
Git fetcher does know how to use HTTP as a transport.
</para>
<para>
Here are some examples that show commonly used mirror
definitions:
<literallayout class='monospaced'>
PREMIRRORS ?= "\
bzr://.*/.* http://somemirror.org/sources/ \n \
cvs://.*/.* http://somemirror.org/sources/ \n \
git://.*/.* http://somemirror.org/sources/ \n \
hg://.*/.* http://somemirror.org/sources/ \n \
osc://.*/.* http://somemirror.org/sources/ \n \
p4://.*/.* http://somemirror.org/sources/ \n \
svn://.*/.* http://somemirror.org/sources/ \n"
MIRRORS =+ "\
ftp://.*/.* http://somemirror.org/sources/ \n \
http://.*/.* http://somemirror.org/sources/ \n \
https://.*/.* http://somemirror.org/sources/ \n"
</literallayout>
It is useful to note that BitBake supports
cross-URLs.
It is possible to mirror a Git repository on an HTTP
server as a tarball.
This is what the <filename>git://</filename> mapping in
the previous example does.
</para>
<para>
Since network accesses are slow, Bitbake maintains a
cache of files downloaded from the network.
Any source files that are not local (i.e.
downloaded from the Internet) are placed into the download
directory, which is specified by the
<link linkend='var-DL_DIR'><filename>DL_DIR</filename></link>
variable.
</para>
<para>
File integrity is of key importance for reproducing builds.
For non-local archive downloads, the fetcher code can verify
sha256 and md5 checksums to ensure the archives have been
downloaded correctly.
You can specify these checksums by using the
<filename>SRC_URI</filename> variable with the appropriate
varflags as follows:
<literallayout class='monospaced'>
SRC_URI[md5sum] = "value"
SRC_URI[sha256sum] = "value"
</literallayout>
You can also specify the checksums as parameters on the
<filename>SRC_URI</filename> as shown below:
<literallayout class='monospaced'>
SRC_URI = "http://example.com/foobar.tar.bz2;md5sum=4a8e0f237e961fd7785d19d07fdb994d"
</literallayout>
If multiple URIs exist, you can specify the checksums either
directly as in the previous example, or you can name the URLs.
The following syntax shows how you name the URIs:
<literallayout class='monospaced'>
SRC_URI = "http://example.com/foobar.tar.bz2;name=foo"
SRC_URI[foo.md5sum] = 4a8e0f237e961fd7785d19d07fdb994d
</literallayout>
After a file has been downloaded and has had its checksum checked,
a ".done" stamp is placed in <filename>DL_DIR</filename>.
BitBake uses this stamp during subsequent builds to avoid
downloading or comparing a checksum for the file again.
<note>
It is assumed that local storage is safe from data corruption.
If this were not the case, there would be bigger issues to worry about.
</note>
</para>
<para>
If
<link linkend='var-BB_STRICT_CHECKSUM'><filename>BB_STRICT_CHECKSUM</filename></link>
is set, any download without a checksum triggers an
error message.
The
<link linkend='var-BB_NO_NETWORK'><filename>BB_NO_NETWORK</filename></link>
variable can be used to make any attempted network access a fatal
error, which is useful for checking that mirrors are complete
as well as other things.
</para>
</section>
<section id='bb-the-unpack'>
<title>The Unpack</title>
<para>
The unpack process usually immediately follows the download.
For all URLs except Git URLs, BitBake uses the common
<filename>unpack</filename> method.
</para>
<para>
A number of parameters exist that you can specify within the
URL to govern the behavior of the unpack stage:
<itemizedlist>
<listitem><para><emphasis>unpack:</emphasis>
Controls whether the URL components are unpacked.
If set to "1", which is the default, the components
are unpacked.
If set to "0", the unpack stage leaves the file alone.
This parameter is useful when you want an archive to be
copied in and not be unpacked.
</para></listitem>
<listitem><para><emphasis>dos:</emphasis>
Applies to <filename>.zip</filename> and
<filename>.jar</filename> files and specifies whether to
use DOS line ending conversion on text files.
</para></listitem>
<listitem><para><emphasis>basepath:</emphasis>
Instructs the unpack stage to strip the specified
directories from the source path when unpacking.
</para></listitem>
<listitem><para><emphasis>subdir:</emphasis>
Unpacks the specific URL to the specified subdirectory
within the root directory.
</para></listitem>
</itemizedlist>
The unpack call automatically decompresses and extracts files
with ".Z", ".z", ".gz", ".xz", ".zip", ".jar", ".ipk", ".rpm".
".srpm", ".deb" and ".bz2" extensions as well as various combinations
of tarball extensions.
</para>
<para>
As mentioned, the Git fetcher has its own unpack method that
is optimized to work with Git trees.
Basically, this method works by cloning the tree into the final
directory.
The process is completed using references so that there is
only one central copy of the Git metadata needed.
</para>
</section>
<section id='bb-fetchers'>
<title>Fetchers</title>
<para>
As mentioned earlier, the URL prefix determines which
fetcher submodule BitBake uses.
Each submodule can support different URL parameters,
which are described in the following sections.
</para>
<section id='local-file-fetcher'>
<title>Local file fetcher (<filename>file://</filename>)</title>
<para>
This submodule handles URLs that begin with
<filename>file://</filename>.
The filename you specify with in the URL can
either be an absolute or relative path to a file.
If the filename is relative, the contents of the
<link linkend='var-FILESPATH'><filename>FILESPATH</filename></link>
variable is used in the same way
<filename>PATH</filename> is used to find executables.
Failing that,
<link linkend='var-FILESDIR'><filename>FILESDIR</filename></link>
is used to find the appropriate relative file.
<note>
<filename>FILESDIR</filename> is deprecated and can
be replaced with <filename>FILESPATH</filename>.
Because <filename>FILESDIR</filename> is likely to be
removed, you should not use this variable in any new code.
</note>
If the file cannot be found, it is assumed that it is available in
<link linkend='var-DL_DIR'><filename>DL_DIR</filename></link>
by the time the <filename>download()</filename> method is called.
</para>
<para>
If you specify a directory, the entire directory is
unpacked.
</para>
<para>
Here are some example URLs:
<literallayout class='monospaced'>
SRC_URI = "file://relativefile.patch"
SRC_URI = "file://relativefile.patch;this=ignored"
SRC_URI = "file:///Users/ich/very_important_software"
</literallayout>
</para>
</section>
<section id='cvs-fetcher'>
<title>CVS fetcher (<filename>(cvs://</filename>)</title>
<para>
This submodule handles checking out files from the
CVS version control system.
You can configure it using a number of different variables:
<itemizedlist>
<listitem><para><emphasis><filename>FETCHCMD_cvs</filename>:</emphasis>
The name of the executable to use when running
the <filename>cvs</filename> command.
This name is usually "cvs".
</para></listitem>
<listitem><para><emphasis><filename>SRCDATE</filename>:</emphasis>
The date to use when fetching the CVS source code.
A special value of "now" causes the checkout to
be updated on every build.
</para></listitem>
<listitem><para><emphasis><filename>CVSDIR</filename>:</emphasis>
Specifies where a temporary checkout is saved.
The location is often <filename>DL_DIR/cvs</filename>.
</para></listitem>
<listitem><para><emphasis><filename>CVS_PROXY_HOST</filename>:</emphasis>
The name to use as a "proxy=" parameter to the
<filename>cvs</filename> command.
</para></listitem>
<listitem><para><emphasis><filename>CVS_PROXY_PORT</filename>:</emphasis>
The port number to use as a "proxyport=" parameter to
the <filename>cvs</filename> command.
</para></listitem>
</itemizedlist>
As well as the standard username and password URL syntax,
you can also configure the fetcher with various URL parameters:
</para>
<para>
The supported parameters are as follows:
<itemizedlist>
<listitem><para><emphasis>"method":</emphasis>
The protocol over which to communicate with the cvs server.
By default, this protocol is "pserver".
If "method" is set to "ext", BitBake examines the
"rsh" parameter and sets <filename>CVS_RSH</filename>.
You can use "dir" for local directories.
</para></listitem>
<listitem><para><emphasis>"module":</emphasis>
Specifies the module to check out.
You must supply this parameter.
</para></listitem>
<listitem><para><emphasis>"tag":</emphasis>
Describes which CVS TAG should be used for
the checkout.
By default, the TAG is empty.
</para></listitem>
<listitem><para><emphasis>"date":</emphasis>
Specifies a date.
If no "date" is specified, the
<link linkend='var-SRCDATE'><filename>SRCDATE</filename></link>
of the configuration is used to checkout a specific date.
The special value of "now" causes the checkout to be
updated on every build.
</para></listitem>
<listitem><para><emphasis>"localdir":</emphasis>
Used to rename the module.
Effectively, you are renaming the output directory
to which the module is unpacked.
You are forcing the module into a special
directory relative to <filename>CVSDIR</filename>.
</para></listitem>
<listitem><para><emphasis>"rsh"</emphasis>
Used in conjunction with the "method" parameter.
</para></listitem>
<listitem><para><emphasis>"scmdata":</emphasis>
Causes the CVS metadata to be maintained in the tarball
the fetcher creates when set to "keep".
The tarball is expanded into the work directory.
By default, the CVS metadata is removed.
</para></listitem>
<listitem><para><emphasis>"fullpath":</emphasis>
Controls whether the resulting checkout is at the
module level, which is the default, or is at deeper
paths.
</para></listitem>
<listitem><para><emphasis>"norecurse":</emphasis>
Causes the fetcher to only checkout the specified
directory with no recurse into any subdirectories.
</para></listitem>
<listitem><para><emphasis>"port":</emphasis>
The port to which the CVS server connects.
</para></listitem>
</itemizedlist>
Some example URLs are as follows:
<literallayout class='monospaced'>
SRC_URI = "cvs://CVSROOT;module=mymodule;tag=some-version;method=ext"
SRC_URI = "cvs://CVSROOT;module=mymodule;date=20060126;localdir=usethat"
</literallayout>
</para>
</section>
<section id='http-ftp-fetcher'>
<title>HTTP/FTP wget fetcher (<filename>http://</filename>, <filename>ftp://</filename>, <filename>https://</filename>)</title>
<para>
This fetcher obtains files from web and FTP servers.
Internally, the fetcher uses the wget utility.
</para>
<para>
The executable and parameters used are specified by the
<filename>FETCHCMD_wget</filename> variable, which defaults
to a sensible values.
The fetcher supports a parameter "downloadfilename" that
allows the name of the downloaded file to be specified.
Specifying the name of the downloaded file is useful
for avoiding collisions in
<link linkend='var-DL_DIR'><filename>DL_DIR</filename></link>
when dealing with multiple files that have the same name.
</para>
<para>
Some example URLs are as follows:
<literallayout class='monospaced'>
SRC_URI = "http://oe.handhelds.org/not_there.aac"
SRC_URI = "ftp://oe.handhelds.org/not_there_as_well.aac"
SRC_URI = "ftp://you@oe.handheld.sorg/home/you/secret.plan"
</literallayout>
</para>
</section>
<section id='svn-fetcher'>
<title>Subversion (SVN) Fetcher (<filename>svn://</filename>)</title>
<para>
This fetcher submodule fetches code from the
Subversion source control system.
The executable used is specified by
<filename>FETCHCMD_svn</filename>, which defaults
to "svn".
The fetcher's temporary working directory is set
by <filename>SVNDIR</filename>, which is usually
<filename>DL_DIR/svn</filename>.
</para>
<para>
The supported parameters are as follows:
<itemizedlist>
<listitem><para><emphasis>"module":</emphasis>
The name of the svn module to checkout.
You must provide this parameter.
You can think of this parameter as the top-level
directory of the repository data you want.
</para></listitem>
<listitem><para><emphasis>"protocol":</emphasis>
The protocol to use, which defaults to "svn".
Other options are "svn+ssh" and "rsh".
For "rsh", the "rsh" parameter is also used.
</para></listitem>
<listitem><para><emphasis>"rev":</emphasis>
The revision of the source code to checkout.
</para></listitem>
<listitem><para><emphasis>"date":</emphasis>
The date of the source code to checkout.
Specific revisions are generally much safer to checkout
rather than by date as they do not involve timezones
(e.g. they are much more deterministic).
</para></listitem>
<listitem><para><emphasis>"scmdata":</emphasis>
Causes the “.svn” directories to be available during
compile-time when set to "keep".
By default, these directories are removed.
</para></listitem>
</itemizedlist>
Following are two examples using svn:
<literallayout class='monospaced'>
SRC_URI = "svn://svn.oe.handhelds.org/svn;module=vip;proto=http;rev=667"
SRC_URI = "svn://svn.oe.handhelds.org/svn/;module=opie;proto=svn+ssh;date=20060126"
</literallayout>
</para>
</section>
<section id='git-fetcher'>
<title>GIT Fetcher (<filename>git://</filename>)</title>
<para>
This fetcher submodule fetches code from the Git
source control system.
The fetcher works by creating a bare clone of the
remote into <filename>GITDIR</filename>, which is
usually <filename>DL_DIR/git</filename>.
This bare clone is then cloned into the work directory during the
unpack stage when a specific tree is checked out.
This is done using alternates and by reference to
minimize the amount of duplicate data on the disk and
make the unpack process fast.
The executable used can be set with
<filename>FETCHCMD_git</filename>.
</para>
<para>
This fetcher supports the following parameters:
<itemizedlist>
<listitem><para><emphasis>"protocol":</emphasis>
The protocol used to fetch the files.
The default is "git" when a hostname is set.
If a hostname is not set, the Git protocol is "file".
You can also use "http", "https", "ssh" and "rsync".
</para></listitem>
<listitem><para><emphasis>"nocheckout":</emphasis>
Tells the fetcher to not checkout source code when
unpacking when set to "1".
Set this option for the URL where there is a custom
routine to checkout code.
The default is "0".
</para></listitem>
<listitem><para><emphasis>"rebaseable":</emphasis>
Indicates that the upstream Git repository can be rebased.
You should set this parameter to "1" if
revisions can become detached from branches.
In this case, the source mirror tarball is done per
revision, which has a loss of efficiency.
Rebasing the upstream Git repository could cause the
current revision to disappear from the upstream repository.
This option reminds the fetcher to preserve the local cache
carefully for future use.
The default value for this parameter is "0".
</para></listitem>
<listitem><para><emphasis>"nobranch":</emphasis>
Tells the fetcher to not check the SHA validation
for the branch when set to "1".
The default is "0".
Set this option for the recipe that refers to
the commit that is valid for a tag instead of
the branch.
</para></listitem>
<listitem><para><emphasis>"bareclone":</emphasis>
Tells the fetcher to clone a bare clone into the
destination directory without checking out a working tree.
Only the raw Git metadata is provided.
This parameter implies the "nocheckout" parameter as well.
</para></listitem>
<listitem><para><emphasis>"branch":</emphasis>
The branch(es) of the Git tree to clone.
If unset, this is assumed to be "master".
The number of branch parameters much match the number of
name parameters.
</para></listitem>
<listitem><para><emphasis>"rev":</emphasis>
The revision to use for the checkout.
The default is "master".
</para></listitem>
<listitem><para><emphasis>"tag":</emphasis>
Specifies a tag to use for the checkout.
To correctly resolve tags, BitBake must access the
network.
For that reason, tags are often not used.
As far as Git is concerned, the "tag" parameter behaves
effectively the same as the "revision" parameter.
</para></listitem>
<listitem><para><emphasis>"subpath":</emphasis>
Limits the checkout to a specific subpath of the tree.
By default, the whole tree is checked out.
</para></listitem>
<listitem><para><emphasis>"destsuffix":</emphasis>
The name of the path in which to place the checkout.
By default, the path is <filename>git/</filename>.
</para></listitem>
</itemizedlist>
Here are some example URLs:
<literallayout class='monospaced'>
SRC_URI = "git://git.oe.handhelds.org/git/vip.git;tag=version-1"
SRC_URI = "git://git.oe.handhelds.org/git/vip.git;protocol=http"
</literallayout>
</para>
</section>
<section id='other-fetchers'>
<title>Other Fetchers</title>
<para>
Fetch submodules also exist for the following:
<itemizedlist>
<listitem><para>
Bazaar (<filename>bzr://</filename>)
</para></listitem>
<listitem><para>
Perforce (<filename>p4://</filename>)
</para></listitem>
<listitem><para>
Git Submodules (<filename>gitsm://</filename>)
</para></listitem>
<listitem><para>
Trees using Git Annex (<filename>gitannex://</filename>)
</para></listitem>
<listitem><para>
Secure FTP (<filename>sftp://</filename>)
</para></listitem>
<listitem><para>
Secure Shell (<filename>ssh://</filename>)
</para></listitem>
<listitem><para>
Repo (<filename>repo://</filename>)
</para></listitem>
<listitem><para>
OSC (<filename>osc://</filename>)
</para></listitem>
<listitem><para>
Mercurial (<filename>hg://</filename>)
</para></listitem>
</itemizedlist>
No documentation currently exists for these lesser used
fetcher submodules.
However, you might find the code helpful and readable.
</para>
</section>
</section>
<section id='auto-revisions'>
<title>Auto Revisions</title>
<para>
We need to document <filename>AUTOREV</filename> and
<filename>SRCREV_FORMAT</filename> here.
</para>
</section>
</chapter>
|