Calendar Year 2017 Q4 OKRs
View GitLab's Objectives and Key Results (OKRs) for Q4 2017. Learn more here!
Objective 1: Grow incremental ACV according to plan
- CRO: Sales efficiency ratio > 1.2 (IACV / sales+marketing)
- CRO: Field Sales efficiency ratio > 1.8 (IACV / sales spend)
- CMO: Marketing efficiency ratio > 3.4 (IACV / marketing spend) => 3.11
- CMO: Establish credibility and thought leadership with Enterprise Buyers delivering on pipeline generation plan through the development and activation of integrated marketing and sales development campaigns:
- CMO: Define category strategy, positioning and messaging and plan for activation across the company. => Done.
- CMO: Develop and document messaging framework => Draft completed.
- PMM: Develop and roll out updated pitch and analyst decks => Drafts completed.
- PMM: Develop Action Plan for Q1/Q2 activation of the strategy => Done.
- PMM: Develop GTM Strategy for EEU => Done.
- Enable Sales with decks and Early Adopter Program => Done.
- CMO: Continue website redesign iteration to support our awareness and lead generation objectives, accounting for distinct audiences. => IA work initiated. Multiple pages redesigned, /resources 75% complete and /kubernetes launched.
- MSD: achieve target in new inbound opportunity SQL $ => Achieved 90% of target.
- MSD: achieve target in new outbound opportunity SQL $ => Achieved 72% of target.
- MSD: achieve new opportunity volume target in strategic Sales segment accounts => Achieved 137% of target.
- CMO: Build out Product Marketing function, including hiring and on-boarding three people, updating objectives, process and handbook, and developing cross-functional alignment. => Partially completed with IC role. Not achieved on hiring additional roles.
- PMM: Evolve and deliver updated AE pitch deck messaging and incorporate Sales feedback => see individual items below
- PMM: Update current “toolchain DevOps” to EE pitch Deck => 85% complete.
- PMM: Create CE to EE “get a demo” Deck => complete.
- PMM: Create CE to EE Pitch Deck => Deferred to Q1.
- PMM: Create SVN to EE pitch Deck => Deferred to Q1.
- PMM: Develop a website Information Architecture (IA) to roll out in 2018 => 33% complete.
- PMM: Enhance ROI section of website including adding interstitial to promote individual calculators => Deferred to Q1.
- PMM: Work with content team to deliver 5 customer case studies or customer-centric blog posts. => 75%. 5 Case Studies in Queue. Customer interviews complete. Drafts in progress.
- PMM: Evolve EEP vs EES vs CE differentiation messaging and optimize website experience for product sections => Complete. Product pages updated.
- CRO: 100% of new business IACV plan
- CRO: Sales Health 80% of Reps on Quota - missed
- CRO: Increase Average Sales Assisted New Business Deal by 30% - achieved
- RD: 70% of sales assisted deals are EEP
- CS: Launch Professional Services - 25% of all deals with subscription ACV of $100,000+ include PS
- CRO: Decrease Time to close (New Business - Sales Assisted) by 10% - missed
- Sales Ops: Launch MEDDPIC. Fields 100% filled out for all stage 3+ deals
- CRO: 165% net retention in Q4 renewal value - achieved
- Sales Ops: 95% of subscriptions under $1200 moved to auto-renew
- CS: 25% of large/strategic accounts on EES trialling EEP with how we expand game plan documented in SFDC
- CS: Double License Usage within our large and strategic accounts
- CS: Identify the trigger(s) to purchase for large/strategic accounts
- CS: Build Customer Advisory board program
- CFO: Efficiency
- Legal: Publish sales training guide on key legal terms common in deals
- Legal: Hold one sales team training session published on courses
- Legal: Implement a legal section for sales team on-boarding
- Legal: Use GitLab issues to track and collaborate legal matters
- Controller: Plan for implementing ASC 606 approved
- Controller: Move billing administration from SMB team to billing specialist
- Billing Specialist: Create a Zuora training module for quick and efficient training of new sales reps.
- Analytics: Create a user journey funnel
Objective 2: Popular next generation product
- CEO: Next generation
- VP Product
- Platform: Make way for a cloud IDE. Multi-file editor with terminal shipped.
- Discussion: Portfolio management: Epics and roadmap view shipped (as part of EEU)
- CI/CD: Improve support for Java development lifecycle. 1 more project done => Partially indirectly done: Public HTML artifacts
- CI/CD: Improve existing features with additional value. 1 feature extended. => Done: ex. Upload artifacts to Object Storage
- Prometheus: Make GitLab easier to monitor. GA Prometheus monitoring of the GitLab server, deprecate InfluxDB.
- Prometheus: Shift performance monitoring earlier in the pipeline, detecting regressions prior to merge. Deliver one feature.
- Prometheus: Complete the feedback loop by comparing Canary and Stable performance.
- VP Eng
- Generate joint PM/Eng plan to update issue taxonomy (labels) to aid prioritization and kickoff implementation => 30% done, MR started, need to socialize and implement next quarter
- Support Lead
- 85% Premium Support SLA (up from 68% last quarter) => 83%
- +5% MoM on all other SLAs => October: Average SLA: 52%. December Average SLA: 60%. 2% MoM Average improvement.
- One hire through active sourcing => Not hit, but Augie is sourcing EMEA Support Engineers for us!
- Support Blog post every other week to use as recruiting tool => Done!
- Design Lead
- Document GitLab UX standards / style guide and communicate to internal team via UX Group Conversation, the UX Style guide, and within issues => 90% completed, wrapping up design.gitlab next quarter
- Finish Phase 1 of a design library of assets => 88% Complete, will finish the last 3 assets in next quarter
- Frontend (AC) Lead
- Write 60 unit tests to resolve test debt in Q4 (evenly distributed throughout the team) => 602% of 120 for both teams (Overall 723 new tests)
- Crush 140 bugs this quarter (evenly distributed through the team) => 130% (366 Frontend Bugs closed)
- Improve codebase by making modules ready for webpack by moving it to our new coding standards (#38869) => 93% (Will be completed with 10.5)
- Improve performance by making library updates and webpack bundle optimizations (#39072) => 55%
- Finish conversion from inline icons to SVG Icons to improve performance => 80% done (Will be completed with 10.5)
- Frontend (DC) Lead
- Write 60 unit tests to resolve test debt in Q4 (evenly distributed throughout the team) => 602% of 120 (Overall 723 new tests)
- Crush 140 bugs this quarter (evenly distributed through the team) => 130% (366 Frontend Bugs closed)
- Refactor the MR discussion in Vue to decrease load times, and increase performance/usability => 70%, Will ship in 10.5
- Remove global namespaces, to enable webpack code splitting, which improves performance (#38869) => 93% (Will be completed with 10.5)
- Director of Backend
- Author a demo script for use throughout the quarter => 70% complete, script still varied week to week
- Expose EE and CE code coverage metrics => 100% complete, see https://gitlab-org.gitlab.io/gitlab-ce/coverage-ruby and https://gitlab-org.gitlab.io/gitlab-ee/coverage-ruby
- Assist 1 top tier customer switch to GitLab and ensure P1 bugs/issues get fixed => 80% complete, customer switch was successful, but some issues remain
- Distribution Lead
- Establish baseline metric for install time/ease and come up with a plan to achieve and maintain it => 10% complete
- Decrease build times from 60 minutes to 30 minutes => Done
- Create integration test for Mattermost => Done
- Platform Lead
- Identify 1 sub-standard area of the code base and raise local unit test coverage up to project level => No area was identified and no unit test coverage was raised
- Write integration test for backup/restore => Not scheduled, and not done
- Make GitLab QA test LDAP => Not scheduled, and not done
- Resolve or schedule all priority 1 & 2 Platform issues (and groom performance issues) => Out of the 23 AP1, AP2, SP1, SP2, SL1 and SL2 issues that existed at the beginning of the quarter, 12 are resolved, 7 are scheduled, and 4 are currently unscheduled. 19/23 resolved or scheduled is 83% done!
- CI/CD Lead
- Add 1 integration for runners => Done
- Resolve or schedule all priority 1 & 2 CI/CD issues (and groom performance issues) => 33% resolved (6/16 for P1, 8/26 for P2), 19% scheduled (5/10 for P1, 3/18 for P2)
- Reduce amount of system failures to less than 0.1% => As of 5th Dec it was 0.29% [Failure rate](https://gitlab.com/gitlab-com/infrastructure/issues/3349)
- Improve cost efficiency of CI jobs processing for GitLab.com and GitLab Inc. => We process all jobs on DO and Google; since Google billing is more favorable, we are more cost effective
- Discussion Lead
- Write integration test for squash/rebase => MR in progress, but not merged: gitlab-org/gitlab-ce!15964. Blocked by the branches MR.
- Write integration test for protected branches => MR in progress, but not merged: gitlab-org/gitlab-ce!15627 / gitlab-org/gitlab-ce!15626. Blocked by object storage work.
- Resolve or schedule all Priority 1 & 2 Discussion issues (and groom performance issues) => All AP1, AP2, SL1, SL2, and SP1 issues are either scheduled or done. However, we did not address the SP2 issues, many of which are feature proposals, not bugs. By category, let’s call that 5/6 addressed, or 83% complete.
- Prometheus Lead
- Reach parity with Prometheus metrics for Unicorn, Sidekiq, and gitlab-shell and Deprecate InfluxDB => 70% complete, Unicorn and Sidekiq metrics shipped, gitlab-shell and Deprecate InfluxDB moved to Q1.
- Make Grafana dashboards available for all Prometheus data easy to install for GitLab instances => 50% complete, Dashboards created, but need polish and documentation.
- Identify 1 sub-standard area of the code base and raise local unit test coverage up to project level => Done, prometheus-client-mmap now has better testing and coverage.
- Resolve or schedule all Priority 1 & 2 Prometheus issues (and groom performance issues) => Done, prometheus-client-mmap performance improvements shipped.
- Geo Team
- Make Geo Generally Available => Done in 10.2
- Geo performant at GitLab.com scale => 30% done: a full HA testbed was built, but it hasn’t been pushed to the limits yet; marking as only 30% since there may be more unknown unknowns to deal with here.
- Manual failover robust in Geo as first step to Disaster Recovery => 80% done: manual failover demo-ed and documented, but still incomplete.
- Director of Quality
- Document what quality means at GitLab on an about page => Done. See High Level Goals here
- Communicate standard in 3 different ways to internal team => 33% Complete. Only covered in handbook.
- Make issue/board scheme change recommendation to allow us to better mine backlog for quality metrics => 0% complete. Not started due to other priorities
- Initiate a project to make quality metrics and charts self-service => Done. See gitlab-insights project
- Initiate a project to allow for UI testing of the web application locally and on CI => Done. GitLab QA does this, however the work to make it production ready continues
- Edge Lead
- Ship large database seeder for developers => 0% complete. Not started due to other priorities.
- Enable triage to be used for any project => Done.
- Make GitLab QA test the Container Registry => 0% complete. Not started, but other QA tests were done.
- Make GitLab QA test upgrade from CE to EE => Done.
- Make GitLab QA test simple push with PostReceive => 0% complete. Not started, but other QA tests were done.
- De-duplicate at least 5 redundant (feature) tests => 20% complete. Removed one duplicated test.
- Improve at least the 5 longest spec files by at least 30% => 0% complete. Not started due to other priorities.
- Investigate code with less than 60% tests coverage and add tests for at least the 5 most critical files => 0% complete. Not started due to other priorities.
- Investigate encapsulating instance variables about the current page in a class => 0% complete. Not started due to other priorities.
- Reduce duplication in at least 5 forms => 0% complete. Not started due to other priorities.
- Solve at least 3 outstanding performance issues => Done.
- Director of Security
- Compliance framework: Detail Breach Notification Policy - first draft => 80% done, will need further refinement next quarter.
- Compliance framework: Create and develop GDPR checklist => Done, will implement checklist items assigned to security next quarter.
- CTO: Scan source code for security issues. Make it work for 3 popular frameworks.
- CTO: Less effort to merge CE into EE. 10 times less effort to merge CE into EE
- CTO: Start new projects that might materially affect the scope and future of the company.
- VP Product
- CEO: Partnerships
- VP Product
- Increase adoption of Kubernetes through integration.
- CI/CD: Configure review apps and deployment for projects in less than 5 steps => not done
- Prometheus: Enable Prometheus monitoring of Kubernetes clusters with a single click.
- Platform: Help partners and customers adopt GitLab. Ship authentication and integration requirements.
- Platform: Ship the GNOME requirements. 5 requirements shipped.
- At least 3 GNOME projects migrated to GitLab as part of evaluation
- AWS QuickStart guide published
- VP Product
- CEO: Preparing GitLab.com to be mission-critical
- VP Product
- Improve GitLab.com subscriptions. Storage size increase per subscription level. Ability to upgrade easily.
- Distribution Lead
- Build GCP deployment mechanism on Kubernetes for the migration => 30% complete
- Platform Lead
- Finish Circuit breakers => Circuit breakers are done, but are not enabled in production yet. See this infrastructure issue for more information.
- Gitaly Lead
- 100% of Git operations on GitLab.com go through Gitaly (Gitaly v1.0) => 70% complete
- Demo Gitaly fast-failure during a file-server outage => Done
- Generate a project plan for the GCP migration and get approved by EJ and Sid => Done
- Execute milestone 1 of the GCP migration plan by Dec 15 => Done
- Database Lead
- Demo restore time < 1 hour => postponed until the GCP migration has been completed
- Solve 30% of the schema issues identified by Crunchy => 6.7% done (2 out of 30)
- Database Uptime 99.99% measured in Prometheus => Done!
- SQL timing under 100ms for Issue, MR, project dashboard, and CI pages measured in Prometheus => Improving the 99th percentile has proven to be very difficult, but progress is being made.
- Director of Security
- Strong security for SaaS and on-premise product. Top 10 actions from risk assessment done and actions for top 10 risks started. => 50%, will finish next quarter.
- HackerOne bug bounty program. Implemented and bounties awarded. => Done.
- Security policies for cloud services and cloud migrations. Policy published and enacted. => Done.
- VP Product
- CMO: Build trust of, and preference for GitLab among software developers
- CMO: Hire Director, DevRel/Developer Relations => Deferred to Q1.
- CA: Grow community and increase community engagement. Increase number of new contributors by 10%, increase number of total contributions per release by 5% and increase number of Twitter mentions of GitLab by 10%. => Not achieved.
- PMM: Support field marketing at AWS: Reinvent & KubeCon with booth decks and training => Done.
- MSD: $600K in self serve revenue. => 98% achievement.
- MSD: Grow followers by 20% through proactive sharing of useful and interesting information across our social channels. => 40% achievement of target.
- MSD: Grow number of opt-in subscribers to our newsletter by 20%. => Achieved. Grew 47.84%
- CMO: Generate more company and product awareness including increasing lead over Bitbucket in Google Trends => Achieved. GitLab = 65; Bitbucket = 60.
- MSD: Implement SEO/PPC program to drive increase in number of free trials by 20% compared to last quarter, increase number of contact sales requests by 22% compared to last quarter, increase amount of traffic to about.gitlab.com by 9% compared to last quarter => 34% increase in trial request leads; 21% increase in contact request leads; 7% growth QoQ on about.gitlab.com.
- CMO: PR - October Announcements - 10.0, Series C, Wave, CLA => Done!
- CMO: AR - v10 briefing sweep for targeted analysts => Meetings secured for Q1.
Objective 3: Great team
- CFO: Improve team productivity
- Analytics: Data and Analytics vision and plan signed off by executive team
- Analytics: Real time analytics implemented providing visual representation of the metrics sheet
- Legal: Create plan for implementing Global Data Protection and Data Privacy Plan
- Controller: Reduce time to close from 10 days to 9 days.
- Accounting Manager: Identify and add to the handbook two new accounting policies.
- Accounting Manager: Create monthly process for BvsA analysis with department budget owners.
- CCO: Create an Exceptional Corporate Culture / Delight Our Employees
- Launch training for making employment decisions based on the GitLab Values. Launch by November 15th - Moved to Q1 2018
- Launch a short, quarterly Employee Pulse Survey. Strive for 80% completion. Completed, 69.5% completion.
- Analyze and make recommendations based off of New Hire Survey and Pulse surveys which will drive future KRs. Have at least 3 areas to improve each quarter. Ideally, we will also have 3 areas to celebrate. Completed 1/3, other two moved to Q1 2018.
- Revise the format of the Morning Team Calls to allow for better participation and sharing. Strive for 80% participation. Completed.
- Improve use of the GitLab Incentives by 15%. /handbook/total-rewards/incentives/.
- Discretionary Bonus: 0% change from Q3 2017 to Q4 2017, but 50% increase from Q2 2017 to Q3 2017.
- Referral Bonus: 40% increase for hires in Q4.
- Tinggly: Quadrupled the number of awards granted.
- Iterate on the Performance Review process with at least two changes initiated by end of year. - moved to Q1 2018.
- CCO: Grow Our Team With A-Players
- KR: Socialize and grow participation in our Diversity Referral bonuses by 10% (measurement should be made in January as many hires in December don’t start until January, with awareness that the actual bonuses aren’t paid out for 3 months) - 0% increase.
- More sourced recruiting. 20% of total hires - 4.9% of total hires in Q4.
- Ensure candidates are being interviewed for a fit to our Values as well as ability to do the job, through Manager Training and Follow-up by People Ops - moved to Q1 2018.
- Hire Recruiting Director - Completed.
- 90% of all candidates will be advanced through the pipeline within 7 business days in each phase, maximum. - Average time in each stage for all candidates: 11.57 days.
- CCO: Make All of Our Managers More Effective and Successful
- Provide consistent training to managers on how to manage effectively. Success will mean that there are at least 15 live trainings a year in addition to curated online trainings - moved to Q1 2018.
- Ensure every manager is doing regular 1-on-1 meetings with 2-way feedback. Measure will be seen in Employee Pulse survey, with at least 90% of employees indicating they have received feedback from their manager in the last month. - 83.33% agree someone at work has talked to them about their progress.
- Hire People Business Partners to partner with managers to operate as leadership coaches, performance management advisors, talent scouts, and Culture/Values evangelists. Goal of 2 hires. - Completed.
- VPE: Build the best, global, diverse engineering, design, and support teams in the developer platform industry
- Revise hiring plan for Q1 2018 based on Q4 financials and product ambitions => Done
- Launch 2018 Q1 department OKRs before EOQ4 2017 => Done
- Hire an additional Director of Engineering => Job posted, pipeline looks decent, but hire not made
- Hire a production engineer => Job posted, pipeline looks decent, but hire not made
- Support: Grow the support team to better comply with SLAs and cover gitlab.com cases
- Hire a Services Support Manager => Done, Starting in Feb.
- Hire a support specialist
- Hire an EMEA support engineer => Hired AMER as needed.
- Hire an EMEA/AMER support engineer => Done.
- Hire an AMER support engineer => Done, Starting Jan 8th.
- UX: Increase the profile of GitLab design and grow the team
- Launch first iteration of design.gitlab.com => 100% Complete
- Write 3 public blog posts about GitLab UX and visual design case studies, best practices, anecdotes, or events => 100% Complete
- Hire 2 UX designers => Incomplete. Strong pipeline and process in place.
- Hire a junior UX researcher => 100% Complete.
- Frontend (DC): Hire 3 front end developers => Done
- Frontend (AC): Hire 2 front end developers => Done
- VP of Scaling: Hire an Engineering Manager for the Geo team => Not done
- Distribution
- Hire 2 developers => Done.
- Hire a senior developer => Not done, pipeline for senior is weak.
- CI/CD
- Hire 3 developers => 33% complete
- Hire 2 senior developers => Not done
- Discussion: Hire 2 developers
- Gitaly: Hire a developer => Not done.
- Database: Hire a database specialist => Done! Starting end of January 2018
- Director of Quality
- Hire a test automation lead => 20% Complete. Process has been defined and we are screening candidates
- Hire 3 test automation engineers => 20% Complete. Process has been defined and we are screening candidates
- Director of Security
- Hire Security Engineer(s) => Done, two hires starting in January.
- Hire a Security Specialist Developer => Not a security team hire, but a product developer (SAST).
Retrospective
VPE
- GOOD
- Anecdotally, Engineering had been setting promises and achieving ~20-30% of them in past quarters. It’s important to set aggressive but achievable goals that both motivate the team and accurately represent our bandwidth to the rest of the organization. Otherwise, the wrong business decisions are made. My goal was to raise our achievement somewhat, which I think we’ve done. Eventually, I would like to be regularly hitting 70-80% of our OKRs, and more when things go spectacularly. But it will take time to dial this in, because we do not want to encourage sandbagging.
- The focus on hiring velocity and quality improved dramatically. Teams such as design, frontend, and support did a great job, meeting their plan, raising the bar higher, and making the process more efficient.
- We gave engineering teams many goals that were fully under their control, such as resolving test debt, improving code quality, and improving the hygiene of their backlog and most delivered on these
- We got 2018 Q1 OKRs drafted before the end of the quarter, which helps with adoption
- We delivered geo
- Two revisions of the hiring plan were delivered to the board
- We kicked off the GCP migration project and delivered a milestone
- BAD
- These goals were set in my 2nd week at GitLab, so some missed the mark (which we knew was likely); I know much more for Q1
- Some teams lagged behind in hiring, only getting their vacancies up in late November after being pressed
- My own hires (some of which were inherited from infrastructure) were not made
- The late start and holiday season made it very unlikely that some of our hires would be made (but it was important to capture them on the record, rather than leave them off)
- My goal to enhance our process meandered through the quarter. It began as an effort to start an estimation process and eventually became a taxonomy change
- My time spent with the infrastructure team took the place of several other things I hoped to do, which was unfortunate but the right call
- TRY
- Anticipate the holiday season next year and front-load hires in Q3
- Push for all vacancies to be posted in the first full business week of each quarter
- Assign each team a goal to deliver 100% of commitments for releases (and find a way to measure it)
- Find a way to incentivize hiring in efficient regions
- Assign each applicable team a goal to merge a certain amount of contributions from the community
Director of Backend
- GOOD
- We made Geo Generally Available after a frenetic development effort
- We hired a number of strong backend developers who were able to contribute from Day 1
- We successfully migrated a Tier 1 customer to GitLab
- We shipped major features, such as GPG-signed keys, with the help of the community
- We expanded the use of GitLab QA, adding a Mattermost integration test, and caught a number of regressions
- BAD
- We underestimated the amount of work required to make hashed storage production-ready
- The shuffling of people to Geo, while helpful for Geo, slowed down other teams
- A lot of migrations and features caused unplanned downtime with GitLab.com
- Prometheus metrics got close to running in production, but we still had to turn it off on GitLab.com
- Our customers are still experiencing performance issues, particularly with API access
- We broke LDAP logins (again) in 9.5.0 and still do not have an integration test for this
- TRY
- Make adding integration testing a priority instead of an afterthought (starting with Geo)
- Get more team members involved with identifying significant bugs in Sentry
- Improve overall security release process by defining roles and expectations of release managers, developers, and security team
- Increase team productivity by scheduling pairing sessions with different developers
UX Design Manager
- GOOD
- We documented the majority of existing and UX Ready design patterns in GitLab
- Establishing a pattern library for designers has sped up the design cycle significantly. Designers can quickly put together an entire UI and know it contains the latest standards. This will ensure consistency across the team and application
- As a side-effect, the pattern library has brought to light major usability improvements the UX team has defined but not been successful in getting implemented. We will focus on pushing these improvements into the app
- We hired a Jr. UX Researcher
- We completed phase one of the UI Repository with the help of FE
- We succeeded in publishing three blog posts related to UX vision and implementation. The response from the design community has been positive and we look forward to establishing GitLab as an Open Source Design authority
- BAD
- We were not successful in hiring the two UX Designers we need in spite of our efforts
- We were not successful in closing the loop on UX standards and guidelines. The goal was to deprecate the existing UX Guide in favor of design.gitlab.com but review of standards took longer than anticipated
- Not accounting for holidays and vacations in planning led to some missed deliveries
- TRY
- Focus on getting existing usability improvements implemented (UX Ready, Deferred UX, UI Polish)
- Continue collaborating with FE on UI Repository and UX Backlog cleanup
- Push harder for significant iterative UX improvements in each release
- Anticipate holiday season’s influence on ability to deliver
Staff Developer, Database
- GOOD
- Despite the large OKR we managed to solve a lot of performance issues.
- We improved the team workflow by using issue boards more actively and by having a weekly database meeting.
- We managed to add health / uptime monitoring to Prometheus / Grafana, allowing us to see how the database health changes over time. This is based on the number of alerts sent out, not the uptime of the database.
- We managed to hire a 3rd database specialist.
- We rewrote the GitHub importer from scratch, resulting in much better performance.
- We wrote a (popular) blog post about scaling the database: </2017/10/02/scaling-the-gitlab-database/>
- We managed to optimize retrieving CI pipeline statuses, which used to execute very slow SQL queries.
- BAD
- We added far too much work to the Q4 OKR, resulting in us only being able to complete a small portion of the planned work.
- We didn’t take the summit into account when planning the OKR.
- One database specialist was unavailable for a few weeks due to having to move to a different apartment. This led to a reduction in productivity of the team as a whole.
- There were too many issues that required the help of others, and some of these were not worked on for several weeks.
- We estimated we’d be able to complete 30 schema issues, but only ended up completing two of them.
- 10.3 had a few bad migrations causing trouble.
- TRY
- Schedule more well defined issues for an OKR so we can actually solve them.
- Make it harder to introduce performance problems (planned for Q1 of 2018).
- Assign database specialists to specific areas instead of having them take care of everything (https://gitlab.com/gitlab-com/infrastructure/issues/3139).
- Delegate more work to the other teams so database specialists don’t have to do so much on their own.
Director of Security
- GOOD
- Established a private, paid HackerOne bug bounty program and awarded a monetary bounty to a researcher.
- Hired two security team members, in Application Security and Security Automation.
- Drafted and published GCP Security Guidelines.
- Created Security Vision for 2018 and beyond, including hiring plan.
- BAD
- Security became a team of one during Q4, and that impacted our ability to deliver on some OKRs.
- My goal to deliver on all Top 10 Security Risks Assessment was superseded by needing to spend significant time transitioning non security-impacting tasks to other teams.
- Security Scanner PM role was assigned to me, without a lot of context. Ultimately, that became challenging, because the scope was much larger than anticipated. So, we made the decision to have this role transitioned. My goal is to be more engaged once the security team is larger.
- TRY
- Find a methodology to analyze recurring security vulnerability types, and work towards mitigating clusters of vulnerabilities.
- Work towards creating more automation of security tasks, to scale our team.
- Continue to provide cross-functional security guidance, but through issues and MRs much more frequently, now that workflow fluency is established.
Engineering Manager, Platform Backend
- GOOD
- It’s hard to determine after the fact what percentage of deliverables we actually manage to ship each release, because issues that slip have their milestone adjusted to a future release, instead of keeping the milestone of the release they were originally scheduled for. My perception, however, is that we’ve shipped about 75% of deliverables each release, with that number closer to 70 for 10.2 (because of the summit in October) and 10.4 (because of the holidays in December), and closer to 80 for 10.3 when we had 4 uninterrupted weeks of work.
- We resolved 52% of priority 1 & 2 Platform issues, and scheduled another 30%.
- We added 3 developers to the team, including 2 seniors.
- BAD
- We lost 5 of the 10 people we started the quarter with to other teams (4 to Geo and 1 to CI/CD), including 2 of the new team members, which significantly affected our ability to get work done.
- We didn’t add GitLab QA tests for either backup/restore or LDAP.
- We didn’t identify a sub-standard area of the code base and raise local unit test coverage up to project level.
- Circuit breakers are done, but are not enabled in production yet. See this infrastructure issue for more information.
- We only resolved 52% of priority 1 & 2 Platform issues. We scheduled another 30%, but 17% was left untouched. (Numbers don’t add up to 100% because of rounding.)
- We didn’t manage to resolve any significant tech debt or make progress on engineering tasks without immediate user-facing benefit, like shipping a first iteration of a GraphQL API, or migrating to Rails 5.
- TRY
- Adjust OKRs during the quarter if circumstances (like team capacity) change significantly.
- Create well-defined issues for OKRs to make it harder to lose track of them.
- More proactively keep an eye on SP1, SP2, AP1, AP2, SL1 and SL2 issues.
- Proactively schedule and allocate time for tech debt and “pure” engineering tasks. This will become easier as the team gains people again.
Engineering Manager, Discussion Backend
- GOOD
- No major features slipped.
- All AP1 and AP2 issues are done or scheduled.
- We added one senior developer to the team.
- BAD
- We didn’t look at SP2 issues at all.
- The GitLab QA tests we wrote aren’t done, because they got overtaken by AP issues.
- Some developers ended up with too many issues to work on, because issues that had been blocked later became unblocked.
- We didn’t get any closer to migrating to Rails 5.
- Solving AP1 and AP2 issues often means writing migrations, which then can have a bad production impact on GitLab.com, and we didn’t do a good job of catching those early.
- Migrating uploads to object storage was more complicated than expected.
- Some developers ended up working on a lot of OKR-focused issues, others worked on very few.
- TRY
- Ensuring that we reduce the bug backlog by having every developer work on the backlog.
- Keeping better track of important issues (like SP2 issues), perhaps with an explicit monthly refinement of that list.
- Expressing targets in OKRs as a count of issues to solve, with the backlog size recorded for reference, to make future retros easier.
- Being more conservative about adding new issues when someone has issues in progress at the end of the release.
- Spreading OKR-related issues better among the team.
Frontend Engineering Manager
- GOOD
- We had good results for our targeted OKRs, especially with adding a lot of new Karma tests and crushing bugs
- Overall good results on our deliverables
- We closed 5 great frontend hires and fulfilled our hiring plan
- We made constant progress on our pure engineering tasks and performance improvements (Modules for Webpack, Libraries, SVGs, etc.)
- We reduced the number of regressions produced per release
- BAD
- Our big refactorings/restructurings got way too big, which led to review and merging problems and then to missed cut-off times
- There are still way too many merges around the 7th, which leads to conflicts, broken versions, failing masters, etc.
- Library updates had a 50:50 chance of being either easy or hugely complex
- Sometimes technical topics were rushed, with too many tackled at a time, which then led to confusion in the team
- Our CSS debt is growing release by release
- We were not able to push forward the next big frontend performance topics which need intra team collaboration (images and gzip) except the CDN topic
- TRY
- Establish our new team structure, which will let us drive technical improvements forward faster and make us much more productive in the long run (for example, with stable Vue components), while also allowing more flexible planning of release cycles across the different product areas
- Unify scheduling plans across all product areas and establish good predictability of our velocity
- Make everyone clearly aware of our OKRs and our overall progress on them
- Solve one big technical change at a time with a clear and communicated plan
- Integrate 5 new team members with good onboarding and stay ahead of our hiring plan so that hires are made according to plan
- Continue working closely with UX not only on deliverables but also on broader topics like the component library, SVGs and more
- Better documentation and tooling to support our overall team size and the overall frontend development workflow
- Clear communication and insights from our side when a deliverable is becoming problematic, may slip, needs more attention, etc.
CI/CD Lead
- GOOD
- We seem to be able to ship around 75-85% of our Deliverables every month, and we are starting to notice less overload with issues
- We are continually investing in performance/scalability fixes every month, and we see big improvements in the resiliency of the CI/CD infrastructure, especially from a database perspective
- We manage to resolve a significant amount of technical debt every month
- We started holding our own CI/CD retrospective around one week before the company’s; it helped us voice and address our problems and contribute better to the company-wide one
- We are continuing to improve our monitoring capabilities by having end-to-end monitoring of all CI infrastructure run by GitLab Inc.
- We are able to deal with abuse on GitLab.com and react fast, so it no longer affects the stability of CI
- We are aware of the cost of maintenance and change, and we always account for it in our architectural choices by forcing ourselves to implement not an MEP (engineering product) but an MMP (maintainable product)
- BAD
- We only solved 33% of priority issues. We scheduled only 18% of them to be resolved. Most of the rest are hard, if not impossible, to solve without a conclusive Product decision, as they are feature requests
- We didn’t manage to meet our hiring expectations
- We started monitoring some of our OKRs only at the end of Q4, which resulted in some of them not being completed
- We struggle with a lack of large-scale database knowledge, which results in slower velocity when shipping some of the changes
- TRY
- Make OKRs focus better on goals that are achievable for the team
- More proactively keep an eye on SP1, SP2, AP1, AP2, SL1 and SL2 issues
- Help develop and work closely with the database specialists to ensure that we can improve the velocity of database architectural changes
Support Engineering Manager
- GOOD
- With the help and support of the VPE, we hit 4/5 of our aggressive hiring goals this quarter.
- With the increase in team size and skills, we were able to see dramatic improvements in our SLAs.
- We limited our OKRs to focused, attainable OKRs, which absolutely led to our success.
- BAD
- We are still struggling to hire in EMEA. It’s been our hardest challenge.
- Support SLAs need to be at 100% ASAP, but it will take time to train + hire to meet demands.
- GitLab.com Support Experience is not where it should be.
- TRY
- New Accountability processes in support to avoid SLA Breaches
- Using Active Sourcing targeting EMEA to fill our hiring needs
- Focus on training around new complexities (HA/Kubernetes) in preparation for future demand.
Quality & Edge
- GOOD
- We automated the CE->EE merges.
- We automated the triaging for several projects (gitlab-ce, gitlab-runner, gitlab-qa).
- We increased the Branches page speed by 2x.
- We increased our contributions to QA; the team is ready to be productive in this area for Q1 2018.
- We shipped 2 new RuboCop cops.
- We contributed extensively to the documentation.
- BAD
- Two developers (almost half of the team) were release managers in October/November, so we got fewer things merged during these months.
- We tend to just work on issues that we triage or that we find in reviews, instead of balancing that kind of work with the OKRs.
- Some work has been done towards non-written OKRs, such as “CTO: Less effort to merge CE into EE. 10 times less efforts to merge CE to EE”.
- This objective was definitely in Edge’s scope but we should’ve replaced some other minor OKRs with this one to reflect the reality of where the focus was.
- One team member was responsible for the CE->EE daily merge, which can take a significant amount of time depending on the conflicts and/or CI failures, and which obviously leaves less time to achieve the OKRs we’ve set.
- Developers spend time reviewing community merge requests, which obviously gives less time to achieve the OKRs we’ve set.
- Documentation improvements and new RuboCop cops weren’t part of the OKRs, but are still important.
- As we don’t have Product telling us “what to ship”, we tend to contribute a lot of small changes but I have the feeling we lack “big” achievements because of that.
- This can lead to less motivation, as we have less “big” things to celebrate.
- TRY
- Ensuring that at the end of each week, everyone on the team has made progress toward an OKR.
- Ensuring that community merge requests are being triaged and reviewed (or assigned) in a timely fashion.
Engineering Manager, Distribution
- GOOD
- We are delivering around 90% of scheduled items every month
- The team is increasing their Kubernetes expertise
- Solid progress has been made with the Cloud Native charts, contributing to one of the highest company priorities while keeping customer requirements in mind
- We keep delivering stable package releases; blocking problems related to the package are addressed within one patch release
- We invested time in consolidating our package building infrastructure, which gave us better visibility into infrastructure costs
- We were able to help out with non-team tasks such as GitLab QA
- The number of actionable user-reported issues has stagnated
- Improved the team communication by establishing a team specific issue tracker and process
- We considerably improved HA installation experience
- We delivered a PG HA solution, used on GitLab.com at scale
- Improved user experience by informing users of package deprecations and providing better error messages, which has also prevented issues internal to the company
- Continued regularly decreasing technical debt
- Continued with team training sessions
- BAD
- We did not have time to start working on the OKR for measuring installation time
- The hiring pipeline for a Senior level developer is poor
- We still get tasks added last minute when dependencies or production readiness are not considered during feature planning
- Significant time is used by the team on legacy projects
- The issue queue is not decreasing
- Some of the major tasks that the team has focused on were not described in the OKRs
- TRY
- Establish better tooling that would allow more visibility both inside and outside the team
- Automate tasks that are external team dependencies
- Review legacy projects for automation or deprecation
Gitaly Lead
- GOOD
- A major Gitaly bug was resolved.
- Good progress was made at the end of the quarter on acceptance testing.
- BAD
- By several measures, the Gitaly team made less progress than in the previous quarter.
- For example:
- Q4: 113 vs. Q3: 179 merge requests accepted on Gitaly repo.
- Q4: 202 vs. Q3: 329 issues closed on the Gitaly repo.
- These of course are not necessarily indicators of progress, but the differences should be investigated further.
- For example:
- Two bugs, the Gitaly Lockup bug and the GRPC Call dropped by load balancing policy error bug, took up a significant amount of time.
- Both of these bugs were in underlying libraries and there was little we could do to mitigate them.
- The former was fixed by a change to the Go runtime
- The latter was a GRPC bug which we worked around by rolling back to an old version
- Gitaly lead role reduced to 20% of Andrew’s time
- Distractions around team changes (Jacob possibly leaving team, Andrew’s new split role)
- We kept recruiting until January for a position that had been closed since December
- Very little uptake on the open Gitaly recruiting position, probably due to a confusing title
- After the title was changed from “Backend Developer, Gitaly” to “Backend Developer, Ruby and Go”, there was a surge in candidates.
- TRY
- Better communication around head-count updates.
- Gitaly should get Product representation. Headcount loss was the result of a decision made at a meeting of the Product Team, at which Gitaly was not represented.
- (Too late for Gitaly, but…) For future large migration projects: better upfront analysis and planning of scope.
- In future, do not use the team name as part of the recruiting position title: try to stick to required skills and experience if possible.
Engineering Manager, Geo
- GOOD
- GitLab Geo was shipped as Generally Available in 10.2.
- To get Geo to GA, the team ramped up very quickly in October. Shifting priorities can be a stressor for people, teams, and the organization at large, but in this case it went remarkably smoothly with a lot of support from everyone involved.
- With guidance of the VPE, the Geo team started doing weekly demos, which were very valuable to the development process and are now a standard practice on the team.
- Other teams - notably Solutions Architects - also demo-ed Geo and this “external” set of eyes on the product also contributed to better documentation, UX improvements, etc.
- BAD
- There was significant slippage from milestone to milestone. This is due to (i) focusing on too many objectives per milestone and (ii) recovering from the push to get to GA in 10.2 which involved committing features well beyond the feature freeze windows in 10.2, and then also in 10.3.
- Discovered that hashed storage had been rolled out in 10.0 as GA when in fact it can be considered alpha or beta at the moment. This had knock-on effects for how we plan(ned) to use Geo for the GCP Migration; see the ongoing discussion on the fate of hashed storage.
- Not “bad” per se, but careful tracking showed that our estimates for calendar days spent on issues are at about 50% of actual.
- TRY
- Correcting for the points listed in “bad” by not allowing exceptions to feature freeze in 10.4, and committing to a single focus in 10.5.
- Better estimation of calendar days spent by (i) allowing 0.5 days as the smallest unit of time, (ii) reducing the scope of any issue slated to take more than 5 days, and (iii) using past data to set more realistic targets for following milestones.
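The three estimation rules above could be combined roughly as in the sketch below. It is only an illustration, not the Geo team's actual tooling: the function name `calibrate_estimate`, the `history` format, and the choice to round up are all assumptions made for the example.

```python
import math

HALF_DAY = 0.5   # smallest unit of time allowed in an estimate
MAX_DAYS = 5.0   # issues estimated above this should have their scope reduced


def calibrate_estimate(raw_days, history):
    """Scale a raw estimate using past data, then round it to half days.

    history is a list of (estimated_days, actual_days) pairs from earlier
    issues (a hypothetical format). Returns (calibrated_days, needs_split).
    """
    total_estimated = sum(est for est, _ in history)
    total_actual = sum(act for _, act in history)
    # With no usable history, leave the raw estimate unscaled.
    ratio = total_actual / total_estimated if total_estimated else 1.0
    adjusted = raw_days * ratio
    # Round up to the next half day, never going below half a day.
    calibrated = max(HALF_DAY, math.ceil(adjusted / HALF_DAY) * HALF_DAY)
    return calibrated, calibrated > MAX_DAYS


# Past estimates that were about 50% of actual give a ratio of 2.0.
past_issues = [(2.0, 4.0), (1.0, 2.0), (3.0, 6.0)]
print(calibrate_estimate(1.2, past_issues))
print(calibrate_estimate(3.0, past_issues))
```

Rounding up rather than to the nearest half day is a deliberately conservative choice here, since the observed error was underestimation.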
Prometheus Lead
- GOOD
- We shipped major performance improvements to the Ruby Prometheus client library.
- Prometheus 2.0 work helped production greatly scale to handle additional metrics load as we expand what is possible to collect.
- BAD
- We are very short staffed.
- There was some slippage due to miscommunication between frontend and backend development.
- TRY
- Hiring is a priority for Q1.
- We are working on improving communication by bringing up blocking issues between frontend and backend more than once a week, and making sure we communicate clearly about these blockages.
Last modified November 14, 2024: Fix broken external links (ac0e3d5e)