Migrating to Community-driven Infrastructure
Revision as of 15:23, 27 March 2013
[up to date as of 2013-02-08] Although Nokia's plans to discontinue maemo support had been known since spring 2012, Nokia gave the "Go" to Nemein (service provider on behalf of Nokia) for the real migration work no earlier than 2 weeks before Christmas 2012.
As of January 18th, 2013 the *.maemo.org infrastructure has been consolidated from 20+ physical servers (aka "irons") to the current config and completely migrated to new locations independent of Nokia servers. This task was accomplished by Nemein. The talk.maemo.org forum has been integrated with the rest of the infra; many thanks to Nemein for donating the VM for that. Also many thanks to Nemein for this incredible piece of work, done at a time when others (as well as the guys there) are usually already away for the winter holidays.
The current setup (see below) consists of around 10 Virtual Machines hosted by Nemein on their xen-grid. This is an interim solution. Nokia paid Nemein for this consolidation/migration and hosting until end of February.
Handover of control of the servers is still pending; right now (2013-01-30) it's still Nemein and affiliates who control that infra.
Transfer of control over the (*.)maemo.org DNS entries ("the domain") is still being negotiated between Nokia and HiFo; all DNS changes so far have been made by Nokia's dnsmaster at Nemein's request.
The plans of council and HiFo board so far are: kindly ask Nemein to have *.maemo.org nicely bundled. We hope for this setup to be free of major known bugs (i.e. autobuilder working, repository working albeit maybe slow) when Nemein hands us the package.
[2013-02-08] Negotiations about direct migration to one of our 3 options (see below) are ongoing.
further plans, state of migration
Further plans are to migrate again to some hosted root servers, either at a sponsor like http://osuosl.org/about-osuosl or on our own machines that we may rent from e.g. Hetzner.
[2013-02-08] currently we're in negotiations about 3 possible ways into future hosting:
- osuosl (could provide VM or rootservers or CoLo [UPS server shipping: 48h:1200EUR, 7d:630EUR, +customs])
- IPHH, an ISP in Hamburg. Falk contacted them and they are willing to offer CoLo basically free of charge. Of course we will put their name on our maemo.org frontpage to give due credit. HW service will be done by Falk. (costs ~300EUR for setup and HW upgrade, plus 50..300EUR for shipping the iron to Hamburg)
- get own paid rootservers, like 2 of http://www.hetzner.de/en/hosting/produkte_rootserver/ex10 (costs ~300EUR/month, 400EUR setup)
Depending on option chosen, we might or might not keep the SuperMicro.
[2013-02-17] Hildon Foundation board has agreed to go with IPHH if the contract is good, keeping OSUOSL open as an alternative. We are sending the iron to IPHH on 2013-02-18/19 and are also negotiating with OSUOSL about what a possible migration to them would look like, so we have a decent checklist in case we need it.
Falk's mail forwarded from IPHH to HiFo:
Hi everyone,
these are the details, what IPHH is willing to offer us.
Best regards, Falk

Begin forwarded message:
> From: Rene Sasse <email@example.com>
> Subject: [IPHH #442659] Re: maemo.org
> Date: 18. Februar 2013 11:08:33 MEZ
> To: firstname.lastname@example.org
> Cc: email@example.com
> Reply-To: firstname.lastname@example.org
>
> Falk,
>
> IPHH offers the following services to Hildon Foundation for one year free of
> charge:
>
> * Colocation/electricity for the following devices:
>   - 1 Server (2RU)
>   - 1 Switch (1RU)
> * 1 100MBit/s Uplink Port
> * A /27 IPv4 Network
>
> This offer is valid for one year and has to be discussed for renewal after 11
> month.
>
> Legal Contact will be:
>
> Hildon Foundation
> 120 West 10th Street, Erie, PA, 16501, USA
>
> Technical Contact will be:
>
> Falk Stern (FS7182-RIPE)
> Rathmann-Cohrs-Straße 12, 21357 Bardowick, Germany
> Mobile: +49-160-71560xx
>
> best regards
> Rene
>
> --
> Rene Sasse                       E-Mail: email@example.com
> Technical Consultant             Tel: +49 (0)40 374919-xx
> IPHH Internet Port Hamburg GmbH  Fax: +49 (0)40 374919-xx
> Wendenstrasse 408                AG Hamburg, HRB 76071
> D-20537 Hamburg                  Geschaeftsfuehrung: Axel G. Kroeger
Iron to move from: ( http://nemein.com/fi/ )
Nemein Oy tel. +358 20-198 6030 Vilhonvuorenkatu 11 D, 8 krs 00500 Helsinki, FINLAND FIN-1647219-2 support AT nemein.com
IPHH Internet Port Hamburg GmbH #444615 Wendenstrasse 408 20537 Hamburg Germany T : +49 40 37 49 19-0 F : +49 40 37 49 19-29 E : firstname.lastname@example.org
size x: 100cm y: 66cm z: 28cm weight: ~40kg
Shipment number 1139212793 Status from Wed, 20.02.2013 10:57 hours Delivered - signed for by Herr POLROK* Recipient TPHH Delivered on Herr POLROK*
via DHL account provided by Nokia/Pekka (many thanks!) on 2013-02-19. Courtesy Aslan and Eero of Nemein.
Hosting migration timing plan:
Alternatives - however obvious - to the above plans have been discussed with Nemein and HiFo and are not feasible. E.g. there was no way we could get the money instead of the server hardware. Sustaining the current xen-grid based VM hosting would cost ~1500EUR per month plus a basically unavoidable 2200EUR on top for maintenance. We want to switch away from that by all means, hence the 2nd migration.
This page is intended as a central place where status and other operational information can be gathered.
Plan for migration / Timeline [2013-03-15]
- Friday, 22.2. (falk)
- Rack Hardware @ IPHH - Hardware is racked
- Install base system (CentOS 6.3 with patches from xes)
- Saturday, 23.2. (xes/falk)
- Start migrating repository.m.o
- Start migrating VMs with static data
- ... (hidden DNS master set up)
- sync databases, switch DNS entries
- DNS switched [Nokia] to new IPs on 2013-03-14 1700UTC. Final sync established 1900. Since then machines are up and running on *new*
VMs we need to migrate:
|Name||Disk Size||Location of act. instance||_migrated?||_Comments on *new* instance|
|static||30G||nemein||synced+up||works|
|wiki||20G||nemein||synced+up||works|
|repository||900G||nemein||synced+up||We need to check the disk size, this might be too big for current hw, maybe split tablets-dev off.|
| ||20G||nemein||synced+up||also has lists|
|scratchbox||100G||iphh||setup!||will be set up new|
|vcs||50G||nemein||synced+up||has NFS mounts from garage and repository (copying)|
|garage||100G||nemein||synced+up||has NFS mounts from stage and vcs (copied, seems to work)|
|db||100G||nemein||synced+up||works, needs tuning|
|builder||50G||nemein||copied+up||still needs fixing in several aspects|
|talk||20G||nemein||synced+up||up since 2013-03-13, via HTTP-forward|
|dns||??||iphh||setup!||dns records/serial incomplete, bind inactive|
State of final migration
- talk.maemo.org is running on "new" hardware
- DNS switchover should have happened on 14.3., around 17:00UTC
- all VMs got synced and are up and running. *old* should be out of business once the DNS change has propagated
Setup with IPHH
We have 2 /28 Subnets (22.214.171.124/28 and 126.96.36.199/28)
Networks are configured as follows:
|IPv4||IPv6||VLAN||Xen Bridge||default GW|
|188.8.131.52/28||not yet||1||xenbr0||184.108.40.206|
|220.127.116.11/28||not yet||2||xenbr1||18.104.22.168|
|10.0.1.0/24||not yet||3||xenbr2||10.0.1.1|
IP Plan for vlan 1
|IPv4||IPv6||Hostname|
|22.214.171.124||n/a||firewall-carp|
|126.96.36.199||n/a||firewall-a|
|188.8.131.52||n/a||firewall-b|
|184.108.40.206||n/a||blade-a|
|220.127.116.11||n/a||blade-b|
|18.104.22.168||n/a||portforwarding for monitor|
|22.214.171.124||n/a|| |
|126.96.36.199||n/a|| |
|188.8.131.52||n/a|| |
|184.108.40.206||n/a|| |
|220.127.116.11||n/a|| |
|18.104.22.168||n/a||IPHH Router 1|
|22.214.171.124||n/a||IPHH Router 2|
|126.96.36.199||n/a||IPHH-VRRP|
IP Plan for vlan 2
|IPv4||IPv6||Hostname||Aliases|
|188.8.131.52||n/a||firewall-carp||-|
|184.108.40.206||n/a||firewall-a||-|
|220.127.116.11||n/a||firewall-b||-|
|18.104.22.168||n/a||www||static, maemo.org, planet, downloads|
|22.214.171.124||n/a||wiki||bugs|
|126.96.36.199||n/a||repository||stage|
|188.8.131.52||n/a||lists|| |
|184.108.40.206||n/a||scratchbox||-|
|220.127.116.11||n/a||vcs||drop|
|18.104.22.168||n/a||garage||-|
|22.214.171.124||n/a||builder||-|
|126.96.36.199||n/a||talk||-|
|188.8.131.52||n/a||DNS||-|
|184.108.40.206||n/a||-||-|
IP Plan for vlan 3
Disk Layout of blade-[ab]
Both disks have the following partitioning:
RAID1 Volume for /boot (/dev/md0), consisting of /dev/sda1 and /dev/sdb1 (200M)
RAID1 Volume /dev/md1 consisting of /dev/sda2 and /dev/sdb2 (around 970G) The RAID1 Volume contains a physical LVM volume. We only have one VolumeGroup (vg_blade[ab]), which has LogVol00 with 20G as root volume, LogVol01 with 2 Gig as swap and vmstore with the rest as VM Storage mounted on /vmstore.
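The layout above was presumably created by the installer, but it can be written down as equivalent manual commands for reference. The following is a sketch under that assumption, not a record of what was actually run: the function name is hypothetical, it only prints the commands, and creating and mounting the filesystem on vmstore is left out.

```shell
# print_blade_layout SUFFIX: print the commands for blade-a (SUFFIX=a) or blade-b (SUFFIX=b).
# Nothing is executed; review the output before running any of it as root.
print_blade_layout() {
    vg="vg_blade$1"
    cat <<EOF
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
pvcreate /dev/md1
vgcreate $vg /dev/md1
lvcreate -L 20G -n LogVol00 $vg
lvcreate -L 2G -n LogVol01 $vg
lvcreate -l 100%FREE -n vmstore $vg
EOF
}
```

Calling `print_blade_layout a` lists the steps for blade-a in the order RAID, physical volume, volume group, logical volumes.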
Tips & Tricks for migration
Create an image on vmhost
fallocate -l 200g image.img
or, in case fallocate is unavailable
dd if=/dev/zero of=image.img bs=1 count=1 seek=200G
Attach as loop-device
losetup -f image.img
(find the loop-device and create a filesystem on it)
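The image and loop-device steps can be strung together roughly as follows. This is a minimal sketch, not the commands actually used during the migration: the function names, the ext3 filesystem and the mount point argument are assumptions, and the attach step needs root.

```shell
# Sketch only: function names, filesystem type and paths are illustrative assumptions.

# create_image FILE SIZE: create an image file of the given size (e.g. 200G).
# Falls back to a sparse file via dd when fallocate is missing or unsupported.
create_image() {
    img=$1
    size=$2
    if command -v fallocate >/dev/null 2>&1 && fallocate -l "$size" "$img" 2>/dev/null; then
        return 0
    fi
    # count=0 writes no data; seek only extends the file to the requested size
    dd if=/dev/zero of="$img" bs=1 count=0 seek="$size" 2>/dev/null
}

# attach_and_format FILE MOUNTPOINT: attach the image to the first free loop
# device, create a filesystem on it and mount it. Must run as root.
# ext3 is an assumption; use whatever the VM expects.
attach_and_format() {
    img=$1
    mnt=$2
    loopdev=$(losetup -f --show "$img") || return 1
    mkfs.ext3 -q "$loopdev" || return 1
    mkdir -p "$mnt" && mount "$loopdev" "$mnt" || return 1
    echo "$loopdev"
}
```

`losetup -f --show` prints the device it picked, which avoids having to look it up afterwards.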
tar --create -p -j --one-file-system . | pv -br | ssh root@host 'cd /mountpoint ; tar xpj '
or
cd / ; rsync -arvSxz . root@host:/mount/point
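Before pointing either command at a production VM, the tar pipeline can be rehearsed locally. The sketch below copies one directory tree into another with the same tar flags, minus compression, pv and ssh; the function name and the paths are hypothetical, and `-f -` is spelled out so the archive always goes through the pipe.

```shell
# copy_tree SRC DST: copy SRC into DST, preserving permissions and
# staying on the filesystem SRC lives on (no descent into other mounts).
copy_tree() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    tar --create -p --one-file-system -f - -C "$src" . | tar -x -p -f - -C "$dst"
}
```

Example: `copy_tree /srv/testdata /mnt/newimage` (both paths are placeholders).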
Stuff to do [2013-03-15]
- Implement a proper service monitoring for all machines and applications - nagios pending, http://monitor.maemo.org/ganglia/
- Setup a common policy for root/user accounts and sudo permissions
- Change root-passwords - done
- Make SSH root-login key-only - done?
- Find out, what to sync for final migration - done
- Configure internal DNS server in /etc/resolv.conf
- Coordinate DNS setup with Nokia - partially done
- Consolidate Databases - WIP
- Add disks to system - done, 4TB on blade-a
- Setup bugtracking system for infrastructure - done: roundup?
- fix NFS mounts - WIP
- update VMs to 3.2.0-38
Problems we walked into
Machines throwing their network away
Apparently, XEN has issues if a vm sends too many/too large network packets.
http://lists.xen.org/archives/html/xen-devel/2013-01/msg00198.html has an interesting read about that problem.
xenbr1: port 8(vif51.0) entered forwarding state
vif vif-51-0 vif51.0: Too many frags
vif vif-51-0 vif51.0: fatal error; disabling device
xenbr1: port 8(vif51.0) entered disabled state
Temporary fix: Disable all offloading on eth0
for i in rx tx sg tso gso gro lro; do ethtool -K eth0 $i off; done
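The same loop can be wrapped so that it is easy to preview and to apply to more than one interface. A minimal sketch: the function name and the DRY_RUN variable are assumptions made for illustration, and only the `ethtool -K <dev> <feature> off` call comes from the fix above.

```shell
# disable_offload IFACE...: turn off all offloading features on each interface.
# With DRY_RUN=1 the ethtool commands are only printed, not executed.
disable_offload() {
    for dev in "$@"; do
        for feature in rx tx sg tso gso gro lro; do
            if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "ethtool -K $dev $feature off"
            else
                ethtool -K "$dev" "$feature" off
            fi
        done
    done
}
```

Setting `DRY_RUN=1` and calling `disable_offload eth0` shows what would be changed; without it the function needs root.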
Source of this problem:
We fixed that problem on our machines by ensuring dom0 and domU use same MAX_SKB_FRAGS
Inventory (obsolete, please update)
As a first step we try to gather information about the present infrastructure at *.maemo.org. This "inventory" is intended to provide an overview about all components of the infrastructure as well as to provide information that will later on aid during the actual migration.
Currently the following topics are considered important for the migration:
- Legal Issues (Names, Trademarks, Domain Names, etc.)
- Infrastructure (Web Site, Forum, Wiki, Autobuilder, Mailinglists, Garage, etc.)
What is the state about the name "Maemo"?
"... Maemo is currently a registered trademark of Nokia and the domain name is owned by Nokia.
Who owns "maemo.org"?
Negotiations about domain ownership still ongoing between Hildon Foundation board and Nokia (2013-01-20), if community can't get control over the DNS, we might revert to maemocommunity.org.
Created On:07-Feb-2005 16:26:32 UTC
Last Updated On:07-Jan-2013 10:25:55 UTC
Expiration Date:07-Feb-2014 16:26:32 UTC
Sponsoring Registrar:MarkMonitor Inc. (R37-LROR)
Registrant Name:Nokia Corporation
Registrant Organization:Nokia Corporation
Registrant Street1:P.O.Box 226
Registrant Street2:Nokia Group
Registrant Postal Code:00045
We're planning to ask Nokia to allow a hidden primary for maemo.org, which we will host on a persistent VM (dns) sponsored by Nemein (thanks Eero! :-D ). The purpose is to allow swift changes of IPs under maemo.org without bothering Nokia's DNSmaster, as long as the domain still belongs to Nokia. Once the domain gets transferred to HiFo, this will become less useful but will not be a problem either. In 6 months or so we can consider tearing down the hidden primary and managing our domain directly.
What is needed for the community to run maemo.org?
TMO forums donated to Hildon Foundation: http://maemo.org/community/board/tmo_forums_donated_to_hildon_foundation/
What are the costs?
Nokia paid for hosting until end of February. Current (2013-01-30) interim config (VM on Nemein's xen-grid) will cost 1300EUR/month for the VM, plus 2200EUR/month for the maintenance. For the colocation rackspace, traffic, energy etc of the iron(s) Nokia donates to community there will be another 500+EUR/month. All excl VAT.
At end of February we hope to drop the xen-grid VM since they shall run in a virtualization on our iron by then.
If you're willing to donate, please visit http://hildonfoundation.org/support/
What about the personal information of the users?
[2013-03-20] All of maemo.org is running on our supermicro server colocated at IPHH
List of hardware Nokia will donate to HiFo, according to Nemein's plans. [2013-02-08]
|ID||Hostname||Mgmt IP Address||OOB Mgmt IP Address||Type (Virtual / Baremetal)||System Admin||HW Vendor||HW Model||Form Factor||CPU||Memory||Disk||Acquisition Date||Warranty||Services||Comment|
|01||blade-a.maemo.org||Baremetal||Falk(warfare)||Supermicro||http://www.supermicro.nl/products/system/2u/2027/SYS-2027TR-HTRF.cfm?parts=SHOW||2U 19" Rackmount||Intel® Xeon® processor E5-2620||32GB||(raid1:2*)1TB, 2*2TB=4TB aux.||3 years||Falk (for HH CoLo)||only 2 of the 4 blades populated|
|02||blade-b.maemo.org||Baremetal||Intel® Xeon® processor E5-2620||32GB||(raid1:2*)1TB|
OS and virtualization on community iron (planning, discussion)
Please don't forget to tag your contributions with your nick!
XEN (with OS blabla of above)
The following table is intended to give a concise and easily perceivable overview of the *.maemo.org services. Please use the next sub-section for providing more detailed information.
|Resource||URL (If Applicable)||Migration Status (DONE/WIP/NST)||Service Maintainer||System Admin||Software Name||Software Version||Software License||Known Issues||Last status update|
|Maemo Main Web Site||http://www.maemo.org||BUGS||?||Nemein||orphaned links/404s: http://maemo.org/community/council/system_operator_needed/; Login doesn't work||2013-01-25|
|Maemo Forums||http://talk.maemo.org||DONE||chemist, Reggie||Falk, chemist||vBulletin||Unlimited duration, no upgrades included, acquired on 2012-20-12||Captcha image issues||2013-02-10|
|Maemo Wiki||http://wiki.maemo.org||BUGS||?||Nemein||(Watch) Email not working; random connection timeouts||2013-01-25|
|Repositories||http://repository.maemo.org||BUGS||X-Fade, Merlin1981||Nemein||former akamai serverfarm, now points to stage.m.o VM master of farm. Hashsum errors legacy||2013-02-20|
|Blog aggregator||http://planet.maemo.org||DONE||?||Nemein||login flawed?||2013-02-10|
|Maemo Garage||https://garage.maemo.org/||DONE||?, Woody||Nemein||2013-01-25|
|Maemo Autobuilder||NST||X-Fade||Nemein||OFFLINE, x-fade working on it||2013-02-20|
|Maemo Nameservers||WIP||Merlin, Falk||Nokia||Still using Nokia Nameservers; following hidden primary plan til domain transfer to HiFo established||2013-01-25|
|Listserv||https://lists.maemo.org||BUGS||Nemein||occasional lockups resp interface down||2013-02-20|
|Static||http://static.maemo.org||WIP||Nemein||temporary fix via NAT port81 redir, instable?||2013-02-20|
|Stage||http://stage.maemo.org||obsolete||X-Fade||Nemein||VM got assigned to repository.m.o||2013-02-20|
|Scratchbox||http://scratchbox.org/||WIP||thedead1440||Nemein, thedead1440||220.127.116.11, Logica Finland Oy, migration pending||2013-02-20|
More Detailed Information
In this sub section more detailed information about the entries in the table can be placed. The intent is to keep the table concise while still being able to have all relevant information at hand.
List of VMs and their associated IPs:
IP addresses
18.104.22.168 test.maemo.org # www.maemo.org maemo.org
22.214.171.124 www.maemo.org
126.96.36.199 planet.maemo.org
188.8.131.52 static.maemo.org
184.108.40.206 drop.maemo.org
220.127.116.11 garage.maemo.org
18.104.22.168 lists.maemo.org
22.214.171.124 wiki.maemo.org
126.96.36.199 bugs.maemo.org
# 188.8.131.52 repository.maemo.org scrubbed
184.108.40.206 stage.maemo.org repository.maemo.org (reassigned)
220.127.116.11 vcs.maemo.org
List of internal IP/VM
127.0.0.1 MaemoTemplate
10.0.0.1 maemo static maintenance
10.0.0.2 wiki bugs
10.0.0.121 stage repository
10.0.0.4 mail smtp lists
10.0.0.5 scratchbox
10.0.0.6 dns
#10.0.0.7 repository
10.0.0.9 vcs drop
10.0.0.10 garage
10.0.0.11 db backup
10.0.0.12 builder
10.0.0.254 fw
Cpu Cores, RAM (in MB), storage (DISK, in GB), of the VMs
Current VMs actually in use (some more were reserved originally since it was not certain what services could be merged)
Name      C    RAM   DISK
------------------------
MaemoFW   1   1024     10
Builder   1   4096    150
garage    2   8192    100
test      2   2048     30
wikib     2   2048     50
www       2   6144     70
vcs       2   8192    200
db        2   8192    260
mail      2   2048     30
stage     2   2048    870
talk      2   4096     15
========================
         20  48128   1785
sb        2   2048     30
dns       2   2048     30
========================
         25  52224   1845
Unlike the other services, talk.maemo.org is not behind the endian firewall. Maintenance access is not via the test jumpserver.
Software: vBulletin licence: Unlimited duration, no upgrades included, acquired on 2012-20-12
Scratchbox is also sponsored by Nokia. (Please verify?) Scratchbox is required for running the Fremantle and Harmattan SDK.
Currently there's a VM on Nemein's xen-grid named "scratchbox", but state of the case is unclear.
Tracker for Sysops and Maintainers
This tracker is meant for maemo staff and affiliated only
web frontend: roundup.fourecks.de/maemo/ mail access (read docs!): maemo-issue AT fourecks.de
Service Maintainers (please update/augment/fix)
(please don't usually pester maintainers directly! First try to contact email@example.com, we'll forward)
These are the Service Maintainers (in spe), for services like forum (tmo), wiki, bugs, etc. They are (generally) not sysops of the machines their service is running on.
|From||Nick||Full Name||Services Maintained||Status||Comments|
|Nemein||mashiara||Rambo Eero af Heurlin||eero.afheurlin at <to be disclosed by owner>||(sysop)||[leaving?]|
|Nemein||x-fade||Niels Breet||Niels<at>maemo.org||(mail, IRC, builder, ???...)||[leaving?]|
|Nemein||ferenc||Ferenc Szekely||ferenc<at>maemo.org||(mail, sysop, ???...)||[leaving?]|
|maemo||warfare||Falk Stern||falk<at>fourecks.de||(maemo master sysop)|
|maemo||chemist||Ruediger Schiller||webmaster<at>talk.m.o||Talk|
|maemo||merlin1991||Christian Ratzenhofer||<at>||Repos||[preliminary accepted]|
|???||andre_||Andre Klapper||???<at>???||Bugs||[???]|
|??? (wiki)|
|(planet???)|
All legacy accounts got ported to new infra.
Access to any VM is via plain direct ssh:
we're doing backups to the 4TB auxiliary storage on blade-a, using backupPC:
ssh -L8088:localhost:80 blade-a
konqueror http://localhost:8088
backup-master is Falk
talk VM sysop (chem|st) has access to it and control over own backups, via ssh config on blade-a:
command="sleep 1d",permitopen="127.0.0.1:80" <ssh-pubkey>
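The entry above goes into `~/.ssh/authorized_keys` on blade-a: the forced `sleep 1d` command means the key cannot open a shell, and `permitopen` limits port forwarding to the BackupPC web interface. A small sketch of generating such an entry; the helper name is hypothetical and the key in the usage note is a placeholder.

```shell
# restricted_key_entry PUBKEY: print an authorized_keys line that only allows
# holding a tunnel to 127.0.0.1:80 open (for up to one day per connection).
restricted_key_entry() {
    printf 'command="sleep 1d",permitopen="127.0.0.1:80" %s\n' "$1"
}
```

Usage would look like `restricted_key_entry "ssh-rsa AAAA... user@host" >> ~/.ssh/authorized_keys`. As far as sshd(8) documents it, `permitopen` hosts are compared literally without name lookup, so a client using such a key probably has to request `-L8088:127.0.0.1:80` rather than `-L8088:localhost:80`.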
council is in charge of any steering. Joerg Reisenweber got appointed for "maemo.org infra administration coordinator" and thus is the single point of coordination for any detail questions.
If you have any questions, suggestions, criticism, whatever, please contact Joerg (DocScrutinizer) or any other council member via IRC, or send a mail to council AT maemo.org. We're just community's proxies acting with the best intention to do what's probably in community's best interest. If you don't agree with what we do or have suggestions how we could do better, please holler. Best place: Friday 1800UTC IRC:(freenode.net)#maemo-meeting
- OBS @ TiZen or SuSe : https://bugs.tizen.org/jira/browse/TINF-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
Autobuilder and friends
maemo autobuilder setup
autobuilder consists of multiple VMs
drop VM
- this VM has /etc/passwd synchronised with garage and ~ folders mounted via NFS from garage
- account synchronisation is handled by scripts running on garage VM; the sync is then triggered using ssh and scripts in /usr/local/bin
- packages are uploaded to /mnt/incoming-builder via SCP
garage VM
- this is the VM where stuff happens
- password/account sync to gforge/postgresql is done using
*/10 * * * * root /usr/local/bin/add_groups_users_git_ssh.sh > /tmp/add_groups_users_git_ssh.log dev/null 2>&1
this also updates ~/.ssh/authorized_keys
garage also handles web extras-uploader (/var/lib/extras-assistant/) - package is uploaded and then moved to the same folder as packages uploaded to drop and then chowned using
A lot of jobs on garage VM are done using the local root crontab (/var/spool/cron/crontabs/root)
after package is uploaded it's processed by buildME
buildME runs as builder user and it's started from cron every minute
* * * * * builder /home/builder/buildme
buildme is configured using /etc/buildme.conf
buildme takes care of a couple of things:
- verify that .tar.gz and other files are correct (checked using checksum from .dsc file)
- select free destination (buildme can handle parallel builds on multiple hosts/users)
- scp all required files to selected destination
- start sbdmock on the destination
- copy results back and resulting .deb to repository incoming folder (result_dir = /mnt/builder/%(product)s and repo_queue = /mnt/incoming/extras-devel/%(product)s/)
- send emails to list and user uploading package
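The first step, checking uploaded files against the .dsc, can be illustrated as follows. This is not buildme's actual code, only a sketch of the idea: it assumes the classic `Files:` field of a Debian .dsc (md5sum, size, name per line), that the listed files sit next to the .dsc, and that file names contain no spaces. The function name is hypothetical.

```shell
# verify_dsc FILE.dsc: return 0 only if every file listed in the Files: field
# exists next to the .dsc and matches its recorded md5 checksum.
verify_dsc() {
    dsc=$1
    dir=$(dirname "$dsc")
    # Collect "md5 name" pairs from the indented lines following "Files:"
    list=$(awk '/^Files:/ {f=1; next} /^[^ ]/ {f=0} f && NF==3 {print $1, $3}' "$dsc")
    [ -n "$list" ] || return 1
    while read -r sum name; do
        actual=$(md5sum "$dir/$name" 2>/dev/null | cut -d' ' -f1)
        if [ "$actual" != "$sum" ]; then
            echo "checksum mismatch or missing file: $name" >&2
            return 1
        fi
    done <<EOF
$list
EOF
    return 0
}
```

A caller would reject the upload when `verify_dsc incoming/foo.dsc` returns non-zero (the path is a placeholder).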
this VM has standard installation of scratchbox with no targets configured (it's not required for sbdmock)
when sbdmock is started it cleans up the old build folder, creates a new target, prepares the build environment and then runs dpkg-buildpackage. sbdmock also generates logfiles that are parsed by buildme
this is where repository management happens
*/2 * * * * repository /home/repository/queue-manage-extras-devel.sh
*/5 * * * * repository /home/repository/queue-manage-extras.sh
*/5 * * * * repository /home/repository/queue-manage-community-testing.sh
*/5 * * * * repository /home/repository/queue-manage-community.sh
those scripts (and scripts inside /home/repository/queue-manager-extras) check for new packages in the repository incoming folder, move those to /var/repository/staging, regenerate Packages (using sums that were previously cached), sign it if required and, if any changes happened, touch the .changed file, so we know that we need to sync to live
this file is then checked by /usr/local/bin/packages/rqp.sh, started by /etc/init.d/repository-qp:
1003 10634 1 0 Mar18 ? 00:00:00 /bin/sh /usr/local/bin/packages/rqp.sh
this script starts rsync when required to sync to the live repository
this script also starts repository-queue-proc.php, which processes repository updates coming from midgard (old package cleanup and promotions)
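The flag-file handshake between the queue scripts and rqp.sh can be sketched like this. It is an illustration of the pattern, not the contents of rqp.sh: the function name is hypothetical and the sync command is passed in by the caller.

```shell
# sync_if_changed STAGING_DIR CMD [ARGS...]
# Runs CMD only when STAGING_DIR/.changed exists.
sync_if_changed() {
    staging=$1
    shift
    [ -e "$staging/.changed" ] || return 0
    # Clear the flag before syncing so that changes arriving during the
    # sync set it again and are picked up by the next run.
    rm -f "$staging/.changed"
    if ! "$@"; then
        touch "$staging/.changed"   # sync failed: keep the flag for a retry
        return 1
    fi
}
```

A periodic caller could run something like `sync_if_changed /var/repository/staging rsync -a /var/repository/staging/ <live-destination>`, where the destination is whatever the live repository uses.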