Month: September 2018

Like Drinking Poison

I remember watching the Truth and Reconciliation Commission hearings broadcast back in 1996. It was my first exposure to the idea of restorative justice, and an incredible lesson in forgiveness. Do I consider the slight against me more heinous than murder and rape that withholding forgiveness might be justified? The weekly broadcast was a reminder of how powerful forgiveness can be.

Juxtaposing those hearings with the Ford/Kavanaugh debacle yesterday highlights a failing in the guilt/punishment driven “justice” system.  There are certain crimes where it makes sense to ascribe guilt and mete out punishment — financial reparations when someone has caused monetary loss, removing a person from society when they are apt to continue harming others. I doubt that is the case here. I know what I wanted from the guy who assaulted me — for him to own it and to learn from it. Had Kavanaugh admitted to terrible behavior as a young man, said that there were occasions when he was so drunk he does not recall his actions (thus cannot confidently recall or deny assaulting Ford), been truly remorseful for both his behavior and how his behavior impacted other people, and acknowledged that he is aware of his faults and has stopped drinking (or ceased drinking to excess) and acting in a domineering/privileged way … but, no. We get a belligerent assertion that he is not, well, belligerent. And never drank to excess.

I’d still object to a SCoTUS nominee who thinks it’s debatable whether a president can be indicted while in office. Or that criminal investigations into a president should be deferred until he is out of office. Or that George W approach to torture, extrajudicial detention, and war was a bit of all right. But I would at least get the desire to forgive someone for their decades-old actions if they owned those actions, regretted those actions, and changed.

LDAP Auth With Tokens

I’ve encountered a few web applications (including more than a handful of out-of-the-box, “I paid money for that”, applications) that perform LDAP authentication/authorization every single time the user changes pages. Or reloads the page. Or, seemingly, looks at the site. OK, not the later, but still. When the load balancer probe hits the service every second and your application’s connection count is an order of magnitude over the probe’s count … that’s not a good sign!

On the handful of sites I’ve developed at work, I have used cookies to “save” the authentication and authorization info. It works, but only if the user is accepting cookies. Unfortunately, the IT types who typically use my sites tend to have privacy concerns. And the technical knowledge to maintain their privacy. Which … I get, I block a lot of cookies too. So I’ve begun moving to a token-based scheme. Microsoft’s magic cloudy AD via Microsoft Graph is one approach. But that has external dependencies — lose Internet connectivity, and your app becomes unusable. I needed another option.

There are projects on GitHub to authenticate a user via LDAP and obtain a token to “save” that access has been granted. Clone the project, make an app.py that connects to your LDAP directory, and you’re ready.

from flask_ldap_auth import login_required, token
from flask import Flask
import sys

app = Flask(__name__)
app.config['SECRET_KEY'] = 'somethingsecret'
app.config['LDAP_AUTH_SERVER'] = 'ldap://ldap.forumsys.com'
app.config['LDAP_TOP_DN'] = 'dc=example,dc=com'
app.register_blueprint(token, url_prefix='/auth')

@app.route('/')
@login_required
def hello():
return 'Hello, world'

if __name__ == '__main__':
app.run()

The authentication process is two step — first obtain a token from the URL http://127.0.0.1:5000/auth/request-token. Assuming valid credentials are supplied, the URL returns JSON containing the token. Depending on how you are using the token, you may need to base64 encode it (the httpie example on the GitHub page handles this for you, but the example below includes the explicit encoding step).

You then use the token when accessing subsequent pages, for instance http://127.0.0.1:5000/

import requests
import base64

API_ENDPOINT = "http://127.0.0.1:5000/auth/request-token"
SITE_URL = "http://127.0.0.1:5000/"

tupleAuthValues = ("userIDToTest", "P@s5W0Rd2T35t")

tokenResponse = requests.post(url = API_ENDPOINT, auth=tupleAuthValues)

if(tokenResponse.status_code is 200):
jsonResponse = tokenResponse.json()
strToken = jsonResponse['token']
print("The token is %s" % strToken)

strB64Token = base64.b64encode(strToken)
print("The base64 encoded token is %s" % strB64Token)

strHeaders = {'Authorization': 'Basic {}'.format(strB64Token)}

responseSiteAccess = requests.get(SITE_URL, headers=strHeaders)
print(responseSiteAccess.content)
else:
print("Error requesting token: %s" % tokenResponse.status_code)

Run and you get a token which provides access to the base URL.

[lisa@linux02 flask-ldap]# python authtest.py
The token is eyJhbGciOiJIUzI1NiIsImV4cCI6MTUzODE0NzU4NiwiaWF0IjoxNTM4MTQzOTg2fQ.eyJ1c2VybmFtZSI6ImdhdXNzIn0.FCJrECBlG1B6HQJKwt89XL3QrbLVjsGyc-NPbbxsS_U:
The base64 encoded token is ZXlKaGJHY2lPaUpJVXpJMU5pSXNJbVY0Y0NJNk1UVXpPREUwTnpVNE5pd2lhV0YwSWpveE5UTTRNVFF6T1RnMmZRLmV5SjFjMlZ5Ym1GdFpTSTZJbWRoZFhOekluMC5GQ0pyRUNCbEcxQjZIUUpLd3Q4OVhMM1FyYkxWanNHeWMtTlBiYnhzU19VOg==
Hello, world

A cool discovery I made during my testing — a test LDAP server that is publicly accessible. I’ve got dev servers at work, I’ve got an OpenLDAP instance on Docker … but none of that helps anyone else who wants to play around with LDAP auth. So if you don’t want to bother populating directory data within your own test OpenLDAP … some nice folks provide a quick LDAP auth source.

Using Microsoft Graph

Single Sign-On: Microsoft Graph

End Result: This will allow in-domain computers to automatically log in to web sites and applications. Computers not currently logged into the company domain will, when they do not have an active authenticated session, be presented with Microsoft’s authentication page.
Requirements: The application must be registered on Microsoft Graph.

Beyond that, requirements are language specific – I will be demonstrating a pre-built Python example here because it is simple and straight-forward. There are examples for a plethora of other languages available at https://github.com/microsoftgraph

Process – Application Development:
Application Registration

To register your application, go to the Application Registration Portal (https://apps.dev.microsoft.com/). Elect to sign in with your company credentials.

You will be redirected to the company’s authentication page

If ADSF finds a valid token for you, you will be directed to the application registration portal. Otherwise you’ll get the same logon page you see for many other MS cloud-hosted apps. Once you have authenticated, click “Add an app” in the upper right-hand corner of the page.

Provide a descriptive name for the application and click “Create”

Click “Generate New Password” to generate a new application secret. Copy it into a temporary document. Copy the “Application Id” into the same temporary document.

Click “Add Platform” and select “Web”

Enter the appropriate redirect/logout URLs (this will be application specific – in the pre-built examples, the post-authentication redirect URL is http://localhost:5000/login/authorized

Delegated permissions impersonate the signed in user, application permissions use the application’s credentials to perform actions. I use delegated permissions, although there are use cases where application permissions would be appropriate (batch jobs, for instance).

Add any permissions your app requires – for simple authentication, the default delegated permission “User.Read” is sufficient. If you want to perform additional actions – write files, send mail, etc – then you will need to click “Add” and select the extra permissions.

Profile information does not need to be entered, but I have entered the “Home page URL” for all of my applications so I am confident that I know which registered app corresponds with which deployed application (i.e. eighteen months from now, I can still figure out site is using the registered “ADSF Graph Sample” app and don’t accidentally delete it when it is still in use).

Click Save. You can return to your “My Applications” listing to verify the app was created successfully.

Application Implementation:

To use an example app from Microsoft’s repository, clone it.

git clone https://github.com/microsoftgraph/python-sample-auth.git

Edit the config.py file and update the “CLIENT_ID” variable with your Application Id and update the “CLIENT_SECRET” variable with your Application Secret password. (As they note, in a production implementation you would hash this out and store it somewhere else, not just drop it in clear text in your code … also if you publish a screen shot of your app ID & secret somewhere, generate a new password or delete the app registration and create a new one. Which is to say, do not retype the info in my example, I’ve already deleted the registration used herein.)

Install the prerequisites using “pip install -r requirements.txt”

Then run the application – in the authentication example, there are multiple web applications that use different interfaces. I am running “python sample_flask.py”

Once it is running, access your site at http://localhost:5000

The initial page will load; click on “Connect”

Enter your company user ID and click “Next”

This will redirect to the company’s sign-on page. For in-domain computers or computers that have already authenticated to ADSF, you won’t have to enter credentials. Otherwise, you’ll be asked to logon (and possibly perform the two-factor authentication verification).

Voila, the user is authenticated and you’ve got access to some basic directory info about the individual.

Process – Tenant Owner:
None! Any valid user within the tenant is able to register applications.
Implementation Recommendations:
There is currently no way to backup/restore applications. If an application is accidentally or maliciously deleted, a new application will need to be registered. The application’s code will need to be updated with a new ID and secret. Documenting the options selected when registering the application will ensure the application can be re-registered quickly and without guessing values such as the callback URL.

There is currently no way to assign ownership of orphaned applications. If the owner’s account is terminated, no one can manage the application. The application continues to function, so it may be some time before anyone realizes the application is orphaned. For some period of time after the account is disabled, it may remain in the directory — which means a directory administrator could re-enable the account and set the password to a known value. Someone could then log into the Microsoft App Registration Portal under that ID and add new owners. Even if the ID has been deleted from the directory, it exists as a tombstone and can be restored for some period of time. Eventually, though, the account ceases to exist — at which time the only option would be to register a new app under someone else’s ID and change the code to use the new ID and secret. Ensure multiple individuals are listed as the application owner helps avoid orphaned applications.

Edit the application and click the “Add Owner” button.

You can enter the person’s logon ID or their name in “last, first” format. You can enter their first name – with a unique first name, that may work. Enter “Robert” and you’re in for a lot of scrolling! Once you find the person, click “Add” to set them up as an owner of the application. Click “Save” at the bottom of the page to commit this change.

I have submitted a feature request to Microsoft both for reassigning orphaned applications within your tenant and for a mechanism to restore deleted applications — apparently their feature requests have a voting process, so it would be helpful if people would up-vote my feature request.

Ongoing Maintenance:
There is little ongoing maintenance – once the application is registered, it’s done.

Updating The Secret:

You can change the application secret via the web portal – this would be a good step to take when an individual has left the team, and can be done as a proactive security step as a routine. Within the application, select “Generate New Password” and create a new secret. Update your code with the new secret, verify it works (roll-back is to restore the old secret to the config – it’s still in the web portal and works). Once the application is verified to work with the new secret, click “Delete” next to the old one. Both the create time and first three characters of the secret are displayed on the site to ensure the proper one is removed.

Maintaining Application Owners:

Any application owner can remove other owners – were I to move to a different team, the owners I delegated could revoke my access. Just click the “X” to the far right of the owner you wish to remove.

 

 

Building A Jenkins Sandbox

You can use a pre-built docker container (the “long term support” iteration is published as jenkins/jenkins:lts) or perform a local installation from https://jenkins.io/download/, add a package repo to your package manager config (http://pkg.jenkins-ci.org/redhat-stable/jenkins.repo for RedHat-based systems), or build it from the source repo. In this sandbox example, I will be using a Docker container.

Map the /var/Jenkins_home value to something. This allows you to store user-specific data on your local drive, not within the Docker image. In my case, c: is shared in Docker and I’m using c:\docker\jenkins\jenkins_home to store the data.

I have a java cacerts file mounted to the container as well – my CA chain has been imported into this file, and the default password, changeit, is used. This will allow Java to trust internally signed certificates. The keystore password appears as part of the process (i.e. anyone who can run commands like “ps aux” or “ps -efww” will see this value, so while security best practices dictate the default password should be changed … don’t change it to something like your root password!).

We can now start the Docker container:

docker run -p 8080:8080 -p 50000:50000 -v /c/docker/jenkins/jenkins_home:/var/jenkins_home -v /c/docker/jenkins/cacerts:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/security/cacerts jenkins/jenkins:lts

Once the container is running, you can visit the management web site (http://localhost:8080) and install the modules you want – or just take the defaults (you’ll end up with ‘stuff’ you don’t need … I don’t use subversion, for instance, and don’t really need a plugin for it). For a sandbox, I accept the defaults and then use Jenkins => Manage Jenkins => Manage Plug-ins to remove obviously unnecessary ones. And add any that may be needed (e.g. if you are using Visual Studio solution files, add in the MSBuild plugin).

 

Configuring Authentication (LDAP)

First install the appropriate plug-in – referrals cause authentication problems when using AD as the LDAP authentication source, if you are using AD for authentication … use the Active Directory plugin).

Manage Jenkins => Configure Global Security. Under access control, select the radio button for “LDAP” or “Active Directory”. Configuration is implementation specific.

AD:

Click the button to expand the advanced configuration. You should not need to specify a domain controller if service records for the domain are present in DNS. The “Site” should be “UserAuth”. For the Bind DN, you can use your userid (user@domain.ccTLD or domain\uid format) with your password. Or you can create a dedicated service account – for a “real world” implementation, you would want a dedicated service account (using *your* account means you’ll need to update your Jenkins config whenever you change your password … and when you forget this update, auth fails).

A note about the group membership lookup strategy:

For some reason, Jenkins assumes recursive group memberships will be used (e.g. there is a “App XYZ DevOps Team” that is placed into the “Jenkins Users” group, and “Jenkins Users” is assigned authorizations within the system). Bit of a shame that “none” isn’t an option for cases where there isn’t hierarchical group membership being built out.

There are three lookup strategies available: recursive group queries, LDAP_MATCHING_RULE_IN_CHAIN, and Toke-Groups user attribute. There have been bugs in the “Automatic” strategy that caused timeout failures. Additionally, the group list returned by the three strategies is not identical … so it is possible to have inconsistent authorization results as different strategies are used. To ensure consistent behaviour, I select a specific strategy.

Token-Groups: If you are not using Distribution groups within Jenkins to assign authorization (and you probably shouldn’t since it’s a distribution group, not a security group), you can select the Token-groups user attribute to handle recursive group membership. Token-groups won’t work if you are using distribution groups within Jenkins, though, as only security groups show up in the token-groups attribute.

LDAP_MATCHING_RULE_IN_CHAIN: OID 1.2.840.113556.1.4.1941, LDAP_MATCHING_GROUP_IN_CHAIN is an extended matching operator (something Microsoft added back in Windows 2003 R2) that can be used in LDAP filters:

(member:1.2.840.113556.1.4.1941:=cn=Bob,ou=ResourceUsers,dc=domain,dc=ccTLD)

This operator has known issues with high fan-outs and can cause hangs while data is retrieved. It is, however, a more efficient way of handling recursive group memberships. If your Jenkins groups contain only users, you will not encounter the known issue. If you are using nested groups, my personal recommendation would be to test each option and time logon activities … but if you do not wish to perform a test, this is a good starting option.

Recursive Group Queries: Jenkins issues a new LDAP query for each group – a lot of queries, but straight-forward queries. This is my last choice – i.e. if everything else hangs and causes poor user experience, try this selection.

For Active Directory domains that experience slow authentication through the AD plug-in regardless of the selected recursion scheme, I’ve set up the LDAP plug-in (it does not handle recursive group memberships) but it’s not a straight-forward configuration.

LDAP:

Click the button to expand the advanced server configuration. Enter the LDAP directory connection details. I usually start with clear text LDAP. Once the clear text connection tests successfully, the certificate trust can be established.

You can add a group search filter, but this is not required. If you request your group names start with a specific string, e.g. my ITSS CSG organization’s Jenkins server might use groups that start with ITSS-CSG-Jenkins, you can add a cn filter here to restrict the number of groups your implementation looks through to determine authorization. My filter, for example, is cn=ITSS-CSG-Jenkins*

Once everything is working with clear text, load the Root and Web CA public keys into your Java instance’s cacerts file (if you have more than once instance of Java and don’t know which one is being used … either figure out which one is actually being used or repeat the keytool commands for each cacerts file on your machine).

In the Docker container, the file is /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/security/cacerts and I’ve mapped in from a locally maintained cacerts file that already contains our public keys for our CA chain.

Before saving your changes, make sure you TEST the connection.

Under Authorization, you can add any of your AD/LDAP groups and assign them rights (make sure your local back door account has full rights too!).

Finally, we want to set up an SSL web site. Request a certificate for your server’s hostname (make sure to include a SAN if you don’t want Chrome to call your cert invalid). Shell into the Docker instance, cd into $JENKINS_HOME, and scp the certificate pfx file.

Use the keytool command to create a JKS file from this PFX file – make sure the certificate (PFX) and keystore (JKS) passwords are the same.

Now remove the container we created earlier. Don’t delete the local files, just “docker rm <containerid>” and create It again

docker run –name jenkins -p 8443:8443 -p 50000:50000 -v /c/docker/jenkins/jenkins_home:/var/jenkins_home -v /c/docker/jenkins/cacerts:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/security/cacerts jenkins/jenkins:lts –httpPort=-1 –httpsPort=8443 –httpsKeyStore=/var/jenkins_home/jenkins.cert.file.jks –httpsKeyStorePassword=keystorepassword

Voila, you can access your server using an HTTPS URL. If you review the Jenkins documentation, they prefer leaving the Jenkins web server on http and using something like a reverse proxy to perform SSL offloading. This is reasonable in a production environment, but for a sandbox … there’s no need to bring up a sandbox Apache server just to configure a reverse proxy. Since we’re connecting our instance to the real user passwords, sending passwords around in clear text isn’t a good idea either. If only you will be accessing your sandbox (i.e. http://localhost) then there’s no need to perform this additional step. The server traffic to the LDAP / AD directory for authentication is encrypted. This encryption is just for the client communication with the web server.

 

Using Jenkins – System Admin Stuff

There are several of “hidden” URLs that can be used to control the Jenkins service (LMGTFY, basically). When testing and playing with config parameters, restarting the service was a frequent operation, so I’ve included two service restart URLs here:

   https://jenkins.domain.ccTLD:8443/safeRestart ==> enter quiet mode, wait for running builds to complete, then restart

   https://jenkins.domain.ccTLD:8443/restart ==> Restart not so cleanly

Multiple discussions about creating a more fault tolerant authentication scheme within Jenkins exist on their ‘Issues’ site. Currently, you cannot use local accounts if the directory service is unavailable. Not a big deal if you’re on the company network and using one of our highly available directory solutions. Bit of a shocker, though, if your sandbox environment is on your laptop and you try to play with the instance when not on the company network. In production implementations, this would be a DR consideration (dependency on the directory being recovered). In a cloud-hosted implementation, this creates a dependency on network connectivity into the company.

As an emergency solution, you can disable security on your Jenkins installation. I’d also get some sort of firewall rule (OS-based or hardware firewall) to restrict console access to a trusted terminal server or workstation. To disable security, stop Jenkins. Edit the config.xml file in $JENKINS_HOME, and ifnd the <useSecurity> section. Change ‘true’ to ‘false’ and start Jenkins. You’ll be able to access the console without credentials.

Updating Jenkins Image

General practice for updating an application is not to update a container. Instead, download an updated image and recreate the container with the new image. I store the container initialization command along with the folder to which image directories are mapped. My file system has /path/to/docker/storage/AppName that contains a text file with the initialization command and folder(s) that are mapped into the container. This avoids having to find the proper initialization parameters when I upgrade the container.

To update the container, pull a new image, stop the container, remove the container, and create it again. That is:

docker stop jenkins
docker pull jenkins/jenkins:lts
docker rm jenkins
<whatever you used to create the container>

Why *I* Didn’t Report

I tried to report, but I could not get anyone to TAKE my report. When I was in University, I had an undergrad assistant-ship. One of my responsibilities was overseeing work-study students who helped out in a computer lab. General management stuff – scheduling, sorting out coverage when someone couldn’t make their shift, determining when the lab was available to students, approving time cards. One kid falsified his time card — he’d clock in, leave, and come back a few hours later to clock out. Not making a shift wouldn’t be a problem, I dealt with that quite regularly and generally covered shifts myself if I couldn’t find someone looking to pick up a few extra hours. But asking to get paid for not working was unreasonable (also a crime. It wasn’t just defrauding the University, it was defrauding the Federal Work Study Program). My mental parade of horrors went something like this: chap gets charged with fraud, loses work-study funding, has to leave school, entire life is ruined over a stupid thing he’s done as a 19 year old kid. I wanted to help the guy, so I decided to be in my office before he clocked in. Check the lab every fifteen minutes or so and confirm that he didn’t just step out for a minute. Leave a note on his time card to see me in my office. And TALK to him about it — I know what you’re doing, it’s not right nor is it legal, and there are real ramifications if you persist and I have to report you. The kid was a big guy. Over six foot tall, built like a footballer. I asked a friend of mine to hang out in my office with me.

A few hours later, and I had evidence the dude was falsifying his time card: he came back to punch out. And came into my office as requested — all innocent-like with no idea what I could possibly have wanted to discuss. My friend, unfortunately, had gotten bored and decided to ring her sister about ten minutes before his shift was over.

My office had been a dark room — an important thing, when developing negatives, is to avoid exposure to light. Darkrooms have two rooms — open the first door, enter into the antechamber, kick on the red lights, close the outer door, then open the door to the processing room. I used the antechamber as a storage area, but there were tables and chairs out there too. The inner room was my actual office – desk, chairs, coffee maker. My friend took her call out to the antechamber because my discussion with the work-study student was going to interrupt her call. And closed the door.

So here I am, in a closed room, alone with the guy (a) that intimidated me and (b) with whom I had to have an unpleasant conversation. I explain that we’ve been checking the lab every few minutes and know that he never actually worked his shift. He could call it an emergency and say he came back to cross out the “in” time now that the emergency was sorted. Or he could clock out, and I’d have to report the fraud to administration.

He proceeded to sit on me and kiss me. I could not get up. I was stuck in a rolling office chair, where attempts to push with my feet just scooted the chair around the tile floor. I wasn’t a terribly weak girl, I could bench about 30 kg which was about average for my size. But there was no shoving this guy off of the chair. I was terrified, and in my mind a little angry that my friend — who really only needed to be there for like the last half hour of the guy’s shift — had decided to stay for the full three hours and didn’t care enough to actually help me during the period of time I actually wanted help. I was kissed and groped at for minutes before I was able to injure the guy enough that he fell off the chair. And I ran out to the antechamber. The guy was furious, but he wasn’t going to do anything with my friend standing right there, or with the antechamber door opened to the hallway. He stormed off.

But I was still terrified. I rang the police — even leaving aside sexual component of the assault, false imprisonment is a crime. Assault is a crime. I reported. And was directed to campus security because, for the price of a few new police cruisers and other “support” … evidently the city police do not respond to on campus crimes. Noise complaint, ring campus security. A flaming sofa out in the public street, ring campus security (although the fire department will eventually respond). Some kid assaults you and prevents you from leaving a room, ring campus security.

Well, you know what campus security has to say about sexual assault? There’s no evidence, I have no witness, it’s he said/she said. And it’s important that we be able to tell prospective freshman about the low level of on-campus crime. Including the zero rate of sexual crimes. So in addition to abject terror, humiliation, and eeeeewwwwww that I felt, I got to add in a heap of betrayal because, about a year ago, I was one of those prospective freshman getting the sales pitch. I remember hearing about the zero on-campus sex crime stat and wondering how that was possible. The students were still kids. You could find a keg party any night of the week. And whilst neither youth nor inebriation exonerate criminal action … they are certainly factors that contribute to it.

I told several friends — primarily because I did not ever want to be left alone with the guy, and I needed them to understand why. I’m telling the Internet 23 years later because of Trump’s comments about Dr. Ford not reporting. There are millions of different reasons assault victims haven’t reported the crime. None of those reasons mean the attack didn’t happen. None of those reasons mean the attack was anything other than horrifying. And sure I managed to move on. But I will never forget how the guy looked, or smelled, or the feeling of being restrained and assaulted.

So, Trump, #WhyIDidn’tReport … why I don’t have a police report to back up my assault is that the University paid off local law enforcement to ignore on-campus crime, and campus security had a vested interest in maintaining low crime stats. Doesn’t mean it didn’t happen, just means that the so-called justice system fails a lot of people. And, hey, isn’t that the sort of thing a the head of the Executive branch should be fixing?

 

There is a difference

Today it is Jr’s “harassment” note, decades ago it was the assertion that “harassment is an ugly guy trying to get some”, but the fact remains that *harassment* there is a whole spectrum of harassment. Some dude whipping it out on the lunch queue, that’s blatantly obvious harassment without me specifically asking the individual to keep their genitalia covered. But harassment can be subtler too. The harassment I experienced frequently at work (twenty years ago) was the kind that became harassment when the guy refused to stop. A coworker asking me on a date is not harassment; a fellow student asking me to go to a dance is not harassment. Asking a dozen times *IS* harassment. Grabbing at my person and telling me how much I’ll enjoy the date *IS* harassment. And throwing them down on a bed, and attempting to disrobe them whilst covering their mouth … that’s not just harassment, that’s assault. And battery. And likely false imprisonment.

Agile Methodology Is Not Anarchy

For the past several years, my employer has been moving toward an Agile development methodology. There are some challenges when mapping this methodology into operations because it’s not the same thing; but, surprisingly, those are not where I have experienced challenges. The biggest challenge during this transition is some of my coworkers seem to think the methodology is that there are no rules.

A friend of mine, a fairly eccentric history professor, used to say that a little knowledge is a dangerous thing but you’ve got to emphasize LITTLE. And it seems like we’re encountering a situation where Phil’s emphasis holds true: the only thing garnered from from Agile training is that the documentation and process from waterfall projects are no more. But breaking away from the large-scale view of a P.R.O.J.E.C.T. for Agile development is a bit like breaking a monolithic application out into microservices — it still needs to do all of the same ‘stuff’, it just does it differently. And there are still policies and procedures — even a microservice team is going to have a coding standard, a process for handling merges, a way of scheduling time off, and some basic idea of what their application needs to accomplish. Sure, the app’s design will change incrementally over time. But it’s not an emergent property like chaos/complexity theory.

Maybe the “what Agile means to me” mentality comes from failing to clearly map a development methodology into an operations framework. Maybe it’s just a good excuse to avoid components of work that they do not enjoy. To avoid “agile operations” becoming “no boring planning stuff!!!”, I’ve outlined ways in which the Scrum methodologies the company wishes to adopt can be used to streamline operations. It helps that our group is reorganizing into an operational/support group and an architecture/design group — I see a lot of places within the operations team where Scrum approaches make sense.

Backlog — prioritizing the ticket queue like a backlog and having support staff constantly pulling from the top of the list — not only is this an awesome way to avoid the guy who scans the queue for the easy jobs, but it ensures the most important problems are being resolved first. A universal set of stakeholders does not exist for the ticket queue — someone whose ticket is ranked fifteenth on the list may disagree, and they are welcome to add details explaining why the issue is more impacting that it seems on its face. But 90% or more of our tickets are “Sev3” — which basically means both “we want it done ASAP” and “it isn’t a wide-spread high impact outage”. Realistically, dozens of tickets do not have the exact same time constraint and impact. There is extra work for management in converting a ticket bucket into an ordered backlog, but the payoff is that tickets are resolved in an order that correlates to the importance of the issue. In addition to the ticket queue, routine maintenance tasks will be included in the backlog. And prioritized accordingly.

Very short sprints — while developers moving from Waterfall to Agile might start from a month (or two) long sprint and trim weeks as they evolve into the process, operations starts from the other end of the spectrum. Our norm is to grab a ticket, sort it, then look at the queue and grab another one. We are planning for hours, maybe a day or two. This means we might establish application access on Tuesday that isn’t needed until next Monday. Establish a sprint that lasts a week, and use the backlog to get tickets that have lower priority (either because the impact is lower or because resolution is not needed for a week) included in the sprint. Service interruptions, SEV1 and SEV2 tickets, will occur and should be assumed in the sprint planning (i.e. either take enough work that you think it will just get done with no service interruption tickets and accept that some tickets from the sprint will be incomplete or leave some space for service interruption tickets and have staff pull “bonus” tickets from the top of the backlog if they have no work toward the end of the sprint).

Estimation — going through the tickets and classifying each incident as a quick little task, something that will take a few hours, or a significant undertaking facilitates in sprint planning. It’s difficult to know how many tickets I can reasonably expect to include in a sprint if I cannot differentiate between a three minute config change and a three day application rollout.

Multi-tasking — Implementation, support, and ticket resolution tasks are no longer a big bucket of work that individuals attempt to multi-task to complete. There are distinct tasks that are completed in series. Some tickets require information from the user; put the ticket on hold until a response is received and move on to the next unit of work.

Velocity — historic data based on time estimates cannot be generated, but simple number of tickets per week pre- and post- can certainly be compared. And going forward, ticket counts can be weighted by estimation values.

Stand-ups are a bit of a mental sticking point for me. I can conceive the value of spending a few minutes reviewing what you’ve done, what you plan on doing, and ensuring there is a ready forum to discuss any sticking points (maybe someone else has encountered a similar situation and can offer assistance). Stand-ups could include a quick discussion of any priority shifts (escalations, service interruptions) too. *But* my experience with stand-ups has been the attendance test variety — stand-ups that were used to hurt individuals who didn’t make it to the office by 08:00. Or those who weren’t around at 16:50. I don’t think it’s reasonable to ask someone who got into an issue and worked until 7P to show up at 8A the next day. I also don’t think it is reasonable to expect someone who came in at 6A to continue working until 5P. Were a stand-up scheduled in the middle of the day, I might feel differently about them.

Red Herrings

To everyone discussing whether a 17 year old kid who sexually assaults someone at a party (whilst high/drunk/whatever) should have said event preclude them from [promotions, government service, appointment to the Supreme Court] … that’s a red herring. The question is if someone who lies to Congress under oath (possibly repeatedly) should be confirmed to the Court. And that would be a resounding NO regardless of the individual’s politics.

There’s a big difference between elusive “I do not recall X” testimony where you’re not denying the action itself but rather recollection of said action and “I did not do X”. When someone pretty convincingly testifies that you did do X. Be X receiving stolen e-mails or sexually assaulting a woman … well, there are *lots* of things I’ve done but do not recall (although I’m not sure what kind of life you lead when stolen e-mails to advance judicial nominations are so every day that they simply slip your mind). But outright denying it happened?!?

History Without Context

There’s a challenge in teaching history to young people — whilst it is not good to proceed through life ignorant of what has come before you, there are facets of history that are simply incomprehensible to a five year old kid. Explaining why some people are afraid of the police, describing the point of the military … it is a snarl of sociological and political facts, individual experiences … there’s a good and a bad side, but it is difficult to understand points of view without the entire history that created that point of view (a bit like coupling Zinn’s People’s History with Johnson’s History of the American People and calling that a balanced history lesson). I used to advocate for the inclusion of fictional works in University history classes — while the story itself may not be true, fictional works provide a picture into the reality of the time. History provides a context for books, and books provide a context for history. Arthur Miller was not randomly enamored with the Salem witch hunts.

Sadly, Anya’s teacher has begun down the path of history without context. Today (why not yesterday!?!) she taught the kids that “bad people” crashed planes into buildings in DC and NYC, as well as PA. Which left me to try explaining that it’s not like half a dozen people woke up one morning and thought it might be a lark to try flying an aeroplane … only to find it wasn’t as easy as it looks on TV. It was an organized group executing a plan. It was also a group organized partially because of terrible things done across the globe. A cause can be just without justifying any action taken in support of the cause. The validity of a cause doesn’t make the action right any more than “he hit me first” makes slugging your brother right.

A lot of nation-states, countries, and people have done a lot of terrible things to one another in the name of just causes … the events of which the teacher spoke is an egregious example.