
Visual Studio Code

We found a free, open source code editor from Microsoft called Visual Studio Code. There are downloadable modules that include formatting for a variety of programming languages (C#, C++, Fortran), scripting languages (Perl, PHP), and other useful formats like MySQL and Apache httpd config files. It also serves as a GUI front end to git. And that is something I’ve been trying to find since I inherited a git server at work: a way for people to avoid having to remember a dozen different git commands.

Git

I mentioned that I had inherited a Git implementation last week. Here is the documentation I created to teach my coworkers what Git is and how to use it. Some isn’t applicable outside of our environment (you won’t care about the AD groups that control access to the system), some is applicable for small non-dedicated development teams … but I figured I’d post the presentation and quick reference guide on the Internet in case it was useful to someone else.

Background:

Git is a system that provides version control for files – we’re using it to control script/program code versions (source control management), but I could put this document in Git and use the version control to manage edits to the document. You can use it to maintain configuration files – allowing config changes to be traceable. You could use it as a cookbook if you were so inclined – a chef tinkering with a recipe might be interested in going back a few versions and trying something else.

Git provides some functionality that is redundant to other systems – you could, for instance, import our scripts to SharePoint and make code changes within SharePoint. The individual replacing the file is recorded. If a previous version is needed, SharePoint maintains previous versions that can be recovered. Why use Git instead of SharePoint? Git makes it easier to have multiple developers working on a program, including functions to “merge” the edited files together. You can have different versions of the whole project – in SharePoint, I can see different versions of each file, but I have no way of correlating which version of file x.cs goes with y.h

If you want the history, LMGTFY 🙂 Or, you know, read Wikipedia. TL;DR: It’s one of Torvalds’s projects; it was initially used for Linux kernel development and has since become a widely adopted source control management platform. If you have ever looked at a project on GitHub, you have seen a little bit of Git. GitHub is a massive, public Git repository host. Because Git has significant adoption within the OpenSource community, there are a lot of good documents on its internal mechanisms (https://book.git-scm.com/book/en/v2/Git-Internals-Git-Objects for example, if you are interested in how data is stored), how it is used (Google “git cheatsheet” and there are thousands of them, or full books like https://git-scm.com/book/en/v2), and oddball errors that might crop up.

Implementation

We have a Bonobo Git server using Active Directory for both authentication and authorization. The server source is available on GitHub (https://github.com/jakubgarfield/Bonobo-Git-Server) where you can see issues and be included in conversation about source updates (subscribing lets you know when new versions should be available for install).

Questions and bugs regarding the program are maintained in the GitHub issues section. The Google forum that may come up in searches is not active and was retained for history.

This brainshare is primarily to show client-side usage of the Git server. Server setup, configuration, and management are not the focus. One thing I will highlight on the server config: the groups used to provide authorization are not preexisting or nested groups. This means new team members will need to be added to the appropriate “Windstream CSG Git …” group to use the server.

<add key="ActiveDirectoryMemberGroupName" value="Windstream CSG Git Users" />

<add key="ActiveDirectoryTeamMapping" value="VDI=Windstream CSG Git VDI,SharePoint=Windstream CSG Git SharePoint, Directory Design=Windstream CSG Git Directory Design" />

<add key="ActiveDirectoryRoleMapping" value="Administrator=Windstream CSG Git Admins" />

The above snippet is from the Web.config file located on the server at F:\inetpub\www\CSG – if new groups need to be added to the Git server, that is where the magic happens.

Some Terminology:

A repository is a storage location. It can store one file, it could be a whole bunch of files that make up a single program (e.g. the CSOCheck Visual Studio project), it could be a bunch of independent programs that have similar purposes (e.g. ‘Provisioning’ that holds all our provisioning scripts). A repository could be all our code glommed into one place (don’t do this – it makes maintaining an individual program more difficult).

A branch is another server-hosted copy of the project. You don’t want to directly edit the in-use production code (we do but this is certainly not a programming best practice!) – a branch is a copy on which development is done. Once development has been completed, the branch is merged back into the master copy. Looking at Git with small projects and a small number of developers, I wouldn’t expect to see a lot of branches on a project. A large program with a lot of dedicated developers may have some break/fix branches as well as longer term feature enhancement branches.

A fork is a personal copy of a repository. In OpenSource development, forks avoid making changes in someone else’s repository. You create your fork, work within your copy, then offer the changes in your fork for inclusion in the project. We don’t have much need to create forks — we would create a branch within the project.

Project – Bonobo does not seem to have projects, but other Git implementations do. A  project includes the repository, an issues log, pull requests, and sometimes even a Wiki for the application. If you see someone referring to a project, for us that is just the repository.

The project maintainer is the individual who “owns” the project – this isn’t a project sponsor (a non-tech individual who owns a business relationship) but a technical supervisor for the development work who may also have project sponsor define-requirement type roles. The maintainer decides if changes and features are added. You can suggest changes or features – in OpenSource projects, review the existing issues to see if the feature was already requested (and make a new issue to request the feature if one does not exist) before spending a lot of time working on code that will not be accepted. Your idea may be something people are excited to see included in the project. Or it may be something they don’t want (you can always make a fork and add the feature to your iteration of the project). Even a bugfix – your proposed solution may be accepted. Or there may be a reason the maintainer wants to use a different approach to the issue. We do not have project maintainers.

Commit – this is basically making changes to the branch (add a file, delete a file, or modify a file). A commit should represent a single change. By that, I don’t mean every time you change a line, make a commit. You may well have to update a hundred lines of code across five different files to resolve an issue or implement a feature. But it’s just *one* issue or feature being implemented in the commit. You shouldn’t have a commit that implements SSL encryption in LDAP authentication and allows individuals to approve requests for direct reports. These two things have nothing to do with each other, even if they happen to be the two cards you’ve worked on today.

Commit messages are associated with commits and are where you indicate what is being changed in the commit. A “good” commit message is like well commented code – don’t provide too much info like “I added XYZ to line 81 on file abc.def”, but don’t write “Bug fixes” either. A commit message should convey what has been changed without someone having to diff the versions (i.e. saves time). In more formal software development, commit messages also aid in the creation of release notes. Something like “Changed new user template to include ourOrgPerson objectClass” provides enough detail that we can tell what the commit did – if someone wants to find out what lines got edited, they can diff the files and tell. You can view the commit history in the web site or by using “git log”.

Push is the process of updating the server repository with changes you have made on your local repository.

Pull Request is a term you may encounter when reading Git documentation or participating in GitHub. The request basically clues someone into the fact you’ve got code to be reviewed or integrated into an upstream branch. The project maintainer would, once the changes had been reviewed and agreed upon, merge the feature into the repository and close the pull request. This is not a process we are following, nor are code-related discussions or issue lists tracked within the Git server.

Deploy – once the pull request has been approved, you can deploy and test the changes. If the changes do not work, you roll back by re-deploying the existing master.

Merge is used to combine an individual’s local repository with a server-housed copy of the branch or to combine two branches.

GitHub is an Internet based Git repository used by a lot of people and a lot of OpenSource projects. Projects are publicly readable (well, projects held in free accounts. There’s an add-on fee that allows you to maintain private projects). Yes, we could just get GitHub enterprise licenses and use the hosted service. We elected to deploy an internally hosted and maintained server.

Process Flow:

In a simple development environment like we have (we’re not dedicated programmers working on enormous applications), branches are straightforward. We’ve got a master for a project. When we encounter a problem, or wish to expand functionality, we make a working branch. Sort the issue or add the feature, commit. Deploy and test the code, then merge your working branch back into master.
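
The flow described above can be sketched as a short script. This is only an illustration under some assumptions: a bare repository in a temporary directory stands in for our Git server, the branch name “working” and the user name and e-mail are placeholders, and the “deploy and test” step is just a comment.

```shell
set -e
demo=$(mktemp -d)                                   # scratch area for the demo
git init -q --bare "$demo/server.git"               # stand-in for the Git server
git -C "$demo/server.git" symbolic-ref HEAD refs/heads/master

git clone -q "$demo/server.git" "$demo/work"        # warns about an empty repository
cd "$demo/work"
git symbolic-ref HEAD refs/heads/master             # make sure the first branch is "master"
git config user.name "Demo User"                    # placeholder identity
git config user.email "demo@example.com"

# First commit goes straight to master (nothing is in production yet)
echo 'print "Hello, world";' > helloworld.pl
git add helloworld.pl
git commit -q -m "Created project"
git push -q -u origin master

# Later changes happen on a working branch
git checkout -q -b working
echo 'print "Hallo, Welt";' >> helloworld.pl
git add helloworld.pl
git commit -q -m "Added German greeting"

# ... deploy and test the working branch here ...

# Merge back into master (a fast-forward, since master has not moved) and publish
git checkout -q master
git merge -q working
git push -q origin master
git branch -d working                               # collapse the finished branch
git log --oneline
```

With only one person changing the project at a time, master never moves while the working branch exists, so the merge is a simple fast-forward.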

If there is only one person working on a project at a time, merge conflicts are not really a thing. We don’t have ten different branches, we don’t have branches from branches (e.g. a branch for implementing external authentication which then has a branch for LDAP, DB table, and external authentication providers. Then the external authentication provider branch has a branch for Facebook, .NET, and Google authentication providers.). When you have a tree full of branches, you need to resolve merge conflicts (pick which change makes it) before you can merge your pull request branch.

A lot of “how to use Git” is process-related and not technical how-to stuff. Questions like “what are your software development lifecycle management processes?” and “What are your criteria for creating a new branch?”

When I worked with full-time developers, we created an “EmerStaging” branch for development on critical incidents and a “DevStaging” branch for development on non-critical incidents. The EmerStaging branch was only intended to be around for a few days – the branch would start out identical to the master, whatever big deal issue would be sorted, then the branch merged back into the master. These changes would then be sync’d down to all other branches (we don’t want the bug to impact development or, worse, to be reintroduced as someone merges in their long-term development project). The DevStaging branch was always present – there’s always a backlog of lower priority bug-fix type stuff to be done – and the project maintainer would ensure the downstream branches were updated when they processed pull requests. In addition to these break/fix branches, a new branch existed for the long-term development work – next version application or specific new features that had not been assigned to a specific release iteration.

Our environment is not so complex – we should be able to get by with one development branch when there is active development on a project and only the master branch when changes are not being made. Following this process, we avoid the challenges of synchronizing and merging multiple branches and sub-branches.

The Git Client

Simply put, a Git client puts files on the local disk and pushes those files back to the server. The first step is getting a git client installed. The examples I am showing today are using the CLI utilities from https://git-scm.com/download/win simply because I already use them at home (it’s the version . Yes, there are other git clients. Lots. If you have used a different client that you prefer, go for it. Different clients will not corrupt a repository.

Some IDEs have Git integration – their own Git client – which may or may not work with our implementation (some are specific to GitHub / GitHub Enterprise, which is not the same thing). If you are using an IDE, it may be convenient to research integrating your IDE directly with Git. There is no need to do so – you can use the command line utilities to retrieve files, switch over to the IDE for your development work, and then use the command line utilities to add, commit, and merge your changes.

To install the Git-SCM clients, download and run the installer. Selecting the defaults during the installation is sufficient – although if you do not have the Win32 port of the GNU utilities, you can select the third option to get grep and such in DOS.

Once the installation completes, grab the two files from \\CWWAPP695\c$\Program Files\Git\mingw64\ssl\certs and put them into your install path\git\mingw64\ssl\certs folder (I renamed the existing ones, but there’s no reason not to delete them). If you see the error “SSL certificate problem: unable to get local issuer certificate”, re-read the last sentence and try again.

Identify a folder on your computer into which you want to clone projects. You can store different projects in distinct locations or you can have a top-level folder in which all your projects are housed.

Creating A New Repository

Log into https://csggit.windstream.com using your Active Directory username and password (no need to specify domain). Repositories are sorted into groups – a group may be a single application project. For example, “AD Password Filter”. A group may contain several different application projects – for example, “Auth Samples”.

To create a new repository, click the big blue button in the upper right-hand corner that says exactly that. Provide a name for the repository – this cannot contain spaces, but should be descriptive enough that people do not need to actually read through the code to see what the program does. I am making a project called “HelloWorld” because … tradition.

Supplying a group name will sort the repository into a group on that first page – please do this, even if your group is your program. Otherwise it’s like creating all of your files in one folder … fine for a small number of files, but quickly difficult to look at. We may want to make a Misc group to hold oddball one-off programs.

The description field provides a place for freeform text describing the purpose of the program. This doesn’t have to be long, but it would be nice to have something. We can consider adding the server(s) to which the code is deployed – that would provide a quick way to list our scripts, what they do, and where they run.

Contributors can clone, push, and pull a repository. Administrators are additionally able to edit the repository details (i.e. change the stuff we’re putting in here now) and delete the repository.

Select the team(s) which will need to access the repository. Disclaimer – most of my experience with Git is at home using a GitLab server. There are only two of us, so permissioning isn’t really a concern. Not sure exactly how secure this is (i.e. if I don’t select Directory Design, can they still view the source but cannot write to it? Do they not even see the repository? I’m not interested enough to get another ID and add it into the security group, but if someone wants to test now … that would be cool.). Click ‘Create’ and the repository will be created.

Look near the top of the page – there will be a hyperlink to go to the new repository. Click that.

We will need the “General Url” as we begin working with the repository (i.e. copy the link address now).

Working With Your Repository

Now that you have a project and URL, clone the project to your local repository – if this is a new project, ignore the warning. If this is a project you expect to have some existing content … well, don’t ignore the warning:

D:\tempCSG\ljr\Git>git clone https://csggit.windstream.com/CSG/LJR.git

Cloning into ‘LJR’…

warning: You appear to have cloned an empty repository.

 

If you are using the Git credential manager, you will be asked to authenticate to the server the first time you clone a repository. You do not need to specify the domain. When you change your password, you can use the Windows Credential Manager to edit your stored credential.

Once the connection has been authenticated, the client will clone the repository and voilà, we have stuff:

D:\tempCSG\ljr\Git\LJR>dir

Volume in drive D is Data

Volume Serial Number is FA7B-B3E4

 

Directory of D:\tempCSG\ljr\Git\LJR

 

06/26/2017  02:33 PM    <DIR>          .

06/26/2017  02:33 PM    <DIR>          ..

0 File(s)              0 bytes

2 Dir(s)   9,644,441,600 bytes free

 

OK, that wasn’t a whole lot of stuff – it just created a folder for my application! Make some files in there – that may mean using the folder as your IDE project location. It may mean using notepad and making a new file. Whatever your approach, make a new file and add some code.

D:\tempCSG\ljr\Git\LJR>notepad helloworld.pl

Then add the new file(s) to the local git repository – important bit here, we are currently making changes to our copy. If you check the server, it is still an empty project.

D:\tempCSG\ljr\Git\LJR>git add *

The * is a wildcard – if you are working on a larger project, you can add just the files you are updating (i.e. I could use git add helloworld.pl here). I used the wildcard here because a lot of people like the convenience. Personally, I always recommend programmers add by name to ensure they are adding the proper ‘stuff’ to the project. There are other short-cut add options. In Git 2.x, git add . stages new, modified, and deleted files under the current directory; git add -u stages modified and deleted files that are already tracked (no new files); and git add -A stages everything across the whole working tree. (Git 1.x behaved differently: git add . did not stage deletions.)
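
If you want to see the difference between the add options for yourself, here is a small sketch. It assumes a throwaway repository in a temporary directory; the file names and the identity are made up for the demo.

```shell
set -e
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

# Baseline: two tracked files
echo "one" > keep.txt
echo "two" > remove.txt
git add keep.txt remove.txt
git commit -q -m "Baseline with two tracked files"

echo "changed" >> keep.txt                  # modify a tracked file
rm remove.txt                               # delete a tracked file
echo "new" > added.txt                      # create an untracked file

git add -u                                  # tracked files only: modification and deletion
after_u=$(git status --short)
echo "After git add -u:"
echo "$after_u"                             # added.txt still shows as untracked (??)

git add -A                                  # everything, including the new file
after_all=$(git status --short)
echo "After git add -A:"
echo "$after_all"
```

In the short status output, a letter in the first column means the change is staged, and ?? marks a file Git is not tracking yet.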

Then commit – since this is the first file, I am not using a great commit note. Generally I’ve recommended making the first commit note a link to the requirements document … basically if I wanted to find out why we’ve got this program and what it is meant to do, where do I go? In our case, this might be an INC # or a SharePoint URL. Or it might just be a freeform text like “Provision DMZAD group memberships from acildsdb:OSR2.CWSODMZTable”.

D:\tempCSG\ljr\Git\LJR>git commit -m "Created project"

[master (root-commit) 7a66c68] Created project

1 file changed, 2 insertions(+)

create mode 100644 helloworld.pl

 

Check the web view to see what’s in the project: nothing. Push the changes:

D:\tempCSG\ljr\Git\LJR>git push

Counting objects: 3, done.

Writing objects: 100% (3/3), 256 bytes | 0 bytes/s, done.

Total 3 (delta 0), reused 0 (delta 0)

To https://csggit.windstream.com/CSG/LJR.git

* [new branch]      master -> master

I mentioned earlier that making updates in the master branch is not a best practice … the first time around is an exception … there’s no production implementation that you’re going to bugger up. Now that we’ve got a project that’s running in production (pretend), we’ll make a branch when we want to make changes. Check out the branch – this changes your git ‘context’ to the new branch.

D:\tempCSG\ljr\Git\LJR>git branch newEdits

D:\tempCSG\ljr\Git\LJR>git checkout newEdits

Switched to branch ‘newEdits’

Yes, there is a shortcut to doing this – “git checkout -b newBranchName”. Push the new branch:

D:\tempCSG\ljr\Git\LJR>git push origin newEdits

Total 0 (delta 0), reused 0 (delta 0)

To https://csggit.windstream.com/CSG/LJR.git

* [new branch]      newEdits -> newEdits

 

Make some more changes and add the changed file(s) to the local repo

D:\tempCSG\ljr\Git\LJR>notepad helloworld.pl

D:\tempCSG\ljr\Git\LJR>git add helloworld.pl

Commit the changes and push to the server:

D:\tempCSG\ljr\Git\LJR>git commit -m "Added international support"

[newEdits 28365d9] Added international support

1 file changed, 8 insertions(+), 1 deletion(-)

D:\tempCSG\ljr\Git\LJR>git push origin newEdits

Counting objects: 3, done.

Delta compression using up to 2 threads.

Compressing objects: 100% (2/2), done.

Writing objects: 100% (3/3), 335 bytes | 0 bytes/s, done.

Total 3 (delta 0), reused 0 (delta 0)

To https://csggit.windstream.com/CSG/LJR.git

7a66c68..28365d9  newEdits -> newEdits

 

Now if you look @ the repository browser on the web site, https://csggit.windstream.com/CSG/Repository/LJR/newEdits/Blob/helloworld.pl, you will see the additions we’ve made. Add some more and repeat the process.

D:\tempCSG\ljr\Git\LJR>notepad helloworld.pl

D:\tempCSG\ljr\Git\LJR>git add helloworld.pl

D:\tempCSG\ljr\Git\LJR>git commit -m "Added Swedish and Hungarian greetings"

[newEdits b2cbedd] Added Swedish and Hungarian greetings

1 file changed, 2 insertions(+)

 

D:\tempCSG\ljr\Git\LJR>git push origin newEdits

Counting objects: 3, done.

Delta compression using up to 2 threads.

Compressing objects: 100% (2/2), done.

Writing objects: 100% (3/3), 340 bytes | 0 bytes/s, done.

Total 3 (delta 1), reused 0 (delta 0)

To https://csggit.windstream.com/CSG/LJR.git

28365d9..b2cbedd  newEdits -> newEdits

Now look at repository explorer and see how the history is tracked – look @ each commit (notice the commit messages and who made the changes). Click into previous version and see how the differences are tracked.

Fast Forward Merging:

This is possible for simple projects like we’re using – there is a master, a branch for changes, then that branch gets collapsed back into the master when the changes have been finished.

D:\tempCSG\ljr\Git\LJR>git checkout master

Switched to branch ‘master’

Your branch is up-to-date with ‘origin/master’.

 

D:\tempCSG\ljr\Git\LJR>git merge newEdits

Updating 7a66c68..08931d5

Fast-forward

helloworld.pl | 12 +++++++++++-

1 file changed, 11 insertions(+), 1 deletion(-)

 

D:\tempCSG\ljr\Git\LJR>git push

Total 0 (delta 0), reused 0 (delta 0)

To https://csggit.windstream.com/CSG/LJR.git

7a66c68..08931d5  master -> master

 

Check web site – you’ll see your changes in master. But newEdits branch is still there.

D:\tempCSG\ljr\Git\LJR>git branch

* master

newEdits

My recommendation is to collapse the branch (delete it) when you have completed your changes. Otherwise you need to manage branches and merges. If there’s a need for multiple branches of sustained development … that’s beyond the scope of a quick brain share. You can find information on more complex merging operations, including conflict resolution (https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging#_basic_merging) and rebasing (https://git-scm.com/book/en/v2/Git-Branching-Rebasing). Google can also tell you the ongoing debate about etiquette around creating new branches, merging, and rebasing.

To delete a branch once development has been completed and the changes have been merged into master:

D:\tempCSG\ljr\Git\LJR>git push origin --delete newEdits

To https://csggit.windstream.com/CSG/LJR.git

 - [deleted]         newEdits

 

D:\tempCSG\ljr\Git\LJR>git branch -d newEdits

Deleted branch newEdits (was 08931d5).

Notice the commit history / notes were copied from the newEdits branch into the master, so we haven’t lost anything by merging our branch into the master.

Your local repository is not automatically updated with changes other people commit to the project. A pull retrieves changes pushed by others to the Git server. Alternately, use fetch and merge operations to download the changes and play those changes into your local repository.
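
To make the “pull is fetch plus merge” idea concrete, here is a sketch with two local copies of the same project. The assumptions: a bare repository in a temporary directory plays the server, and “alice” and “bob” are made-up users standing in for two of us.

```shell
set -e
demo=$(mktemp -d)
git init -q --bare "$demo/server.git"              # stand-in for the Git server
git -C "$demo/server.git" symbolic-ref HEAD refs/heads/master

# Alice creates the project and pushes it
git init -q "$demo/alice"
cd "$demo/alice"
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
git remote add origin "$demo/server.git"
echo "v1" > script.txt
git add script.txt
git commit -q -m "Created project"
git push -q -u origin master

# Bob clones the project
git clone -q "$demo/server.git" "$demo/bob"

# Alice pushes another change; Bob's copy does not know about it yet
echo "v2" >> script.txt
git add script.txt
git commit -q -m "Added second line"
git push -q origin master

# Bob catches up in two explicit steps
cd "$demo/bob"
before=$(git rev-parse master)
git fetch -q origin                                # download, but do not touch local files
git merge -q origin/master                         # play the downloaded changes into master
git log --oneline
```

Running “git pull origin master” in Bob’s copy would have done the fetch and the merge in one step.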

Since we are not full-time developers, we might opt not to persistently store projects locally (i.e. we have a specific program that needs to be updated, clone the repository locally, perform the edits, push and merge these edits, then destroy the local copy). Provided two people are not simultaneously working on the same project, the newly cloned project is up-to-date each time you start working on a program.

Stashing Changes

If you are working on a particular branch but are not yet ready to commit your changes – and you need to work on something else starting from the previous commit – use “git stash save” to table the changes you’ve currently made (newer Git versions prefer “git stash push”, which does the same job). Make whatever changes you need to make, add those changes, commit them, and then use “git stash pop” to return the tabled changes.
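
A quick sketch of the stash round trip, assuming a throwaway repository in a temporary directory with made-up file contents. I use plain “git stash” here, which tables the changes the same way.

```shell
set -e
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"
git config user.name "Demo User"                # placeholder identity
git config user.email "demo@example.com"

echo "line 1" > helloworld.pl
echo "notes" > README.md
git add helloworld.pl README.md
git commit -q -m "Created project"

# Half-finished work that is not ready to commit
echo "half-finished feature" >> helloworld.pl
git stash                                       # table the change; files match the last commit again

# The something-else that could not wait
echo "urgent fix" >> README.md
git add README.md
git commit -q -m "Updated README notes"

git stash pop                                   # bring the tabled change back
git status --short                              # helloworld.pl shows as modified again
```

The pop applies cleanly here because the urgent fix touched a different file; if both changes hit the same lines, Git would ask you to resolve a conflict.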

Getting Rid Of Stuff

The first question is should you remove something? We often keep old code around for future reference (you want to do something similar, instead of re-writing the whole thing … copy this old program and tweak it for the current need). But leaving every old bit of code in the repository is a bit like never deleting an e-mail message or document on disk … eventually you’ll have a big mess of useless stuff that you’re looking through and backing up.

You could change the repository group to “Archive” (or “zzArchive” so it sorts to the bottom of the web view) – this would retain the code but sort it out into a different logical container to identify it as no longer used code.

Some companies will set up a second Git server dedicated to archive – lower I/O requirements on hardware, not frequently backed up, etc. Old code is pushed up to the archive server and then deleted from the active code server. As we don’t currently have an archive Git server, this isn’t an option. But it is a possibility if inactive code that we want to keep becomes burdensome. Other companies archive the code outside of Git and delete the project from the repository.

To remove a file from the repository, use “git rm filename.xtn”. To remove a repository, you can click the little rubbish can next to the project on the web site.
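
One thing worth knowing about “git rm”: the removal is just another change, so it needs a commit (and a push) like any other edit, and the file stays in the history. A small sketch, assuming a throwaway repository and made-up file names:

```shell
set -e
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

echo "old" > obsolete.pl
echo "current" > helloworld.pl
git add obsolete.pl helloworld.pl
git commit -q -m "Created project"

git rm -q obsolete.pl                       # deletes the file and stages the deletion
git commit -q -m "Removed obsolete script"

ls                                          # only helloworld.pl remains
git show HEAD~1:obsolete.pl                 # the old content is still in the previous commit
```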

There is no such thing as removing a group – as soon as no repositories exist in the group, it will disappear from the web view.

A Note On Binary Files

The typical solution to storing large binary files in Git is to implement LFS – this feature is not yet supported in Bonobo. As such, avoid storing binary files in Git (media, compiled binaries, compressed data). They tend to be large. Because of the distributed nature of Git, large files end up being transmitted to and stored in a lot of places. Frequently changing binary files bloat the server database too. This isn’t to say you cannot store binary files – just that it is a judgement call. Smaller and more static files, great. Three gig files that get updated daily … find another solution.

Many types of binary files do not compress well – especially already compressed files. You can disable delta compression in .gitattributes (*.mp4 binary -delta) to avoid the I/O of attempting to compress already-compressed data.

When merging binary files, diff just tells you they are different. Not particularly illuminating information if you are manually resolving merge conflicts. For non-text files, there may be a filter that allows changes to be represented in a readable format (e.g. Microsoft Word documents like this one) by setting an appropriate filter in .gitattributes (*.docx diff=word). The diff would not include format changes (i.e. if I bolded a specific sentence, that would not be apparent in the diff), but it will display text content that has been updated.
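
Here is a sketch of what those two .gitattributes entries look like in a repository, with “git check-attr” used to confirm Git picked them up. The repository is a throwaway one in a temporary directory. The docx2txt converter is my assumption for the “word” diff driver: it is a separate tool that would have to be installed, and any program that turns a .docx into text could be substituted.

```shell
set -e
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"

cat > .gitattributes <<'EOF'
# Already-compressed media: treat as binary and skip delta compression
*.mp4  binary -delta
# Word documents: diff them with the driver named "word"
*.docx diff=word
EOF

# Tell Git what the "word" driver does (assumes docx2txt is installed and on the path)
git config diff.word.textconv docx2txt

# Ask Git which attributes apply; the files do not need to exist
git check-attr delta diff -- demo.mp4 notes.docx
```

The .gitattributes file is committed with the project, so everyone gets the same rules; the textconv setting lives in each person’s Git configuration, so each of us would need to set that ourselves.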

Remote Repositories

The whole point of Git is distributing copies of the repository elsewhere. It is possible to use Git locally – this would allow a single developer to track and revert changes – but typical implementations have multiple developers pulling from and pushing to a remote repository.

You can have more than one remote repository.

D:\tempCSG\ljr\Git\PKIWeb>git remote -v

origin  https://csggit.windstream.com/CSG/PKIWeb.git (fetch)

origin  https://csggit.windstream.com/CSG/PKIWeb.git (push)

You may notice that we have the word origin in some of the commands – this is the default name given to the remote repository when we clone the project. You can add additional remote repositories (the example I am using is silly since they are the same location). This could be done to transfer a project to a different repository (moving an out-of-support product to an archive Git server or an acquired company moving repositories into the new company’s repository) or to pull a project from an alternate location (other users who maintain their own project for the same application).

D:\tempCSG\ljr\Git\PKIWeb>git remote add ljr https://csggit.windstream.com/CSG/PKIWeb.git

D:\tempCSG\ljr\Git\PKIWeb>git remote -v

ljr     https://csggit.windstream.com/CSG/PKIWeb.git (fetch)

ljr     https://csggit.windstream.com/CSG/PKIWeb.git (push)

origin  https://csggit.windstream.com/CSG/PKIWeb.git (fetch)

origin  https://csggit.windstream.com/CSG/PKIWeb.git (push)

Normally there’s no need for us to do this (i.e. don’t maintain your own copy of a project, create a branch in the existing one!), except if our project is derivative work of an opensource project that we need to publish externally. You could have both the internal Git server and GitHub registered as repositories. Make your changes and do “push origin” as well as “push whateverYouCallGitHub”.

You can also “fetch origin” and “fetch whateverYouCallGitHub”, but to avoid confusion, I would use the internal Git server as the authoritative repository (anyone else in the group may be editing the code) and only push to GitHub.
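
A sketch of the two-remote arrangement, with the caveat that both “servers” here are just bare repositories in a temporary directory. The remote name “github” is arbitrary, as are the file name and identity.

```shell
set -e
demo=$(mktemp -d)
git init -q --bare "$demo/internal.git"         # stand-in for the internal Git server
git init -q --bare "$demo/public.git"           # stand-in for the GitHub copy

git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"                # placeholder identity
git config user.email "demo@example.com"
git remote add origin "$demo/internal.git"      # authoritative repository
git remote add github "$demo/public.git"        # publish-only copy

echo "v1" > tool.txt
git add tool.txt
git commit -q -m "Created project"

git push -q origin master                       # internal server first
git push -q github master                       # then the public copy
git fetch -q origin                             # only ever fetch from the authoritative one
git remote -v
```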

When you no longer need a remote repository, you can remove it.

D:\tempCSG\ljr\Git\PKIWeb>git remote rm ljr

 

D:\tempCSG\ljr\Git\PKIWeb>git remote -v

origin  https://csggit.windstream.com/CSG/PKIWeb.git (fetch)

origin  https://csggit.windstream.com/CSG/PKIWeb.git (push)

README

If you participate in GitHub projects, you will notice a README.md file at the root of projects. This is a standard place to include documentation (hence the name), but it is also rendered out in the Git server web site. For an example, see the AD Password Filter project (https://csggit.windstream.com/CSG/Repository/ADPasswordFilter/master/Tree). If there is not a convenient external reference for the initial commit notes, you may want to consider including program documentation in the README.md file.

What if the changes don’t work?

One nice Git feature is undoing changes. The first thing you need to know is that commits have ID numbers (often called a SHA in documentation). You can find that using “git log” or by looking at the web site.

If the changes are local and haven’t been pushed to the server, just reset your local copy: git reset --hard ID#

If the changes haven’t been merged into the master branch yet (i.e. you clone your dev branch to the script server, test it … then realize that won’t work), use the git revert functions. First find the commit ID. Then use “git revert ID#” and git will create a commit that is the inverse of the commit specified (it undoes whatever the commit does). Don’t forget to push this revert back to the server.

If the problem is just the commit message, you can modify the message (i.e. remove a typo): git commit --amend -m "This is my new commit message". Amending rewrites the commit, so do this before you push.

You can temporarily revert to a specific commit version (say, to see if the problem you are having was introduced in this version) using “git checkout ID#”. If you intend to make changes from the old state, use “git checkout -b previousState ID# ” to create a new branch from that point.
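
The undo options above, run end to end as a sketch. It assumes a throwaway repository in a temporary directory; the file contents and messages are invented, and “unpushed” versus “pushed” is only simulated by the order of the steps since there is no server involved.

```shell
set -e
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"
git config user.name "Demo User"                    # placeholder identity
git config user.email "demo@example.com"

echo "good" > script.txt
git add script.txt
git commit -q -m "Created project"
good=$(git rev-parse HEAD)                          # the commit ID we may want to return to

# Case 1: a local commit that was never pushed
echo "broken" >> script.txt
git commit -q -a -m "Added brokn change"
git commit -q --amend -m "Added broken change"      # fix the typo in the message
git reset -q --hard "$good"                         # throw the commit away entirely

# Case 2: a commit other people may already have
echo "broken" >> script.txt
git commit -q -a -m "Added broken change"
bad=$(git rev-parse HEAD)
git revert --no-edit "$bad"                         # new commit that undoes the bad one
git log --oneline
```

After the reset the bad commit is gone from the branch history; after the revert it is still there, followed by a commit that cancels it out. That is why revert is the safer choice once a change has been shared.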

Ingesting Existing Code

Create the repository. In the directory with your existing code, initialize the directory as a git repository. Add all files to the local repository and commit the initial file load.

D:\Scripts\ljl\wincare-oud>git init
Initialized empty Git repository in D:/Scripts/ljl/wincare-oud/.git/

D:\Scripts\ljl\wincare-oud>git add *

D:\Scripts\ljl\wincare-oud>git commit -m "Uploading existing code to project"
[master (root-commit) 231bafa] Uploading existing code to project
2 files changed, 76 insertions(+)
create mode 100644 _simulateWincare.pl
create mode 100644 res.txt

Add the remote repository (you can use “git remote -v” to confirm the repository has been added) and push the local repository to the remote.

D:\Scripts\ljl\wincare-oud>git remote add origin https://csggit.windstream.com/CSG/WinCareOUDTesting.git

D:\Scripts\ljl\wincare-oud>git push origin master
Counting objects: 4, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (4/4), 1.06 KiB | 0 bytes/s, done.
Total 4 (delta 0), reused 0 (delta 0)
To https://csggit.windstream.com/CSG/WinCareOUDTesting.git
* [new branch]      master -> master

But wait …

We’ve got a whole bunch of code written and stashed somewhere … but how does that deploy it? It doesn’t. For compiled code, there would be a build process that follows the commits. Someone like a build manager (or an automated process) takes the updated source code, compiles it, hands it off for testing (may be manual testing by QA people or may be an automated test program), then supplies the compiled binaries for release or deployment.

With our interpreted code, using Git is a process change. Instead of going to the task server, copying the script file to something-ljr.xtn, editing my copy, testing, then moving my copy back to something.xtn – we would branch the master for development, clone the development branch to our workstation or elsewhere on the terminal server, make changes, test, commit and push those changes, then merge the development branch back into master.

Once the branch has been merged into master, use git on the task server to integrate changes. (The shortcut below can also be done as “git fetch origin master” followed by “git merge origin/master”.) I am assuming that fast-forward merges can be done.

D:\Scripts\ljl\wincare-oud>git pull origin master
From https://csggit.windstream.com/CSG/WinCareOUDTesting
* branch            master     -> FETCH_HEAD
Updating 231bafa..202da14
Fast-forward
_simulateWincare.pl | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

On next script execution, the updated code will be used.

Etiquette

There are guidelines for contributing to OpenSource projects (https://opensource.guide/how-to-contribute/) – if you will be working on public projects, read the guidelines and engage with the other developers. Individual projects may have their own guidelines – Git itself is an OpenSource project on GitHub, but pull requests submitted to the obvious repository (named Git) are ignored.

Here, we all know each other … if you see a ticket that requests a new column in a report or a different format for an export, make a development branch, sort the issue, test it, and merge the development branch back into master.

There is one part of the OpenSource guidelines that produces more readable code when multiple individuals are contributing: coding standards. Software development teams have formal documents that define all manner of form within their coding. How to name variables. Are spaces or newlines used before braces? Are spaces used before parentheses? How are functions named? What does a program or function comment block look like? How are variable and function names cased? When looking at OpenSource projects – or our internal team code – there isn’t a single coding standard. In the absence of a company-supplied standard, most individuals have one of their own. From a class, from a previous job … something.

Some people prefix variable names with type indicators (in a statically typed language, you’ve got to search up to the variable declaration otherwise). Some people appreciate concise code and write if(x == y){ doWhatever; } all on one line, others would consider that hopelessly unreadable. Some people use switch statements, some hate them and would rather long-form the if/elseif/else version. If you are making a quick change (+2 needs to be +4 or some word was misspelt), you don’t need to review the code to see how it is written. For anything beyond a quick edit, it is polite to look at how the project maintainer (or original author in our case) has written the code and follow their form.

 

Git Deployment

I ‘inherited’ the Git server at work – which means I had to learn how the back end component of Git works (beyond my file-system based implementation where there are just clients and a disk location). It is not as complicated as I feared. The chap who had deployed the Git backend at work chose Bonobo – since he no longer works for the company, I cannot just ask why this particular implementation. It’s Windows based and priced in our $0 budget, and I am certain these were selling points. It seems quite stripped down compared to GitHub too – none of the issue tracking / Wiki / chat about it features. Which, for what my department does, is fine. We are not software developers. We have a lot of internal code for task automation, we have some internal code for departmental web sites, and we have some sample code we hand out to other developers (i.e. someone wants to start using LDAP or ADFS authentication, we can give them a sample implementation in their language). There aren’t feature requests. Generally speaking, there aren’t simultaneous development tasks on a project.

Since I deciphered the server implementation at work, I wanted to set up a Git server at home too. The limited feature set of Bonobo was off-putting. I wanted integrated issue tracking. Looking at the available open source and free options, I selected GitLab. As a sandbox – poke around the server, see how it works and what features it offers – I wanted something ready-to-go. I noticed that there is a Docker container for the project. I helped a few friends who were testing Docker as a development and deployment methodology (I’ve even suggested it for my employer’s internal development staff … being able to develop and run an application with an integrated web server *without* needing the Windows permissions and configuration for a web server, and doing it all over again when your computer is replaced, seemed efficient). But I’d never actually used a Docker container before. It is incredibly easy.

Install docker — a bit obvious, but that was the most time consuming part of the process. I elected to install it on my Windows laptop for expediency. If we decide not to use GitLab, I haven’t thrown a bunch of unnecessary binaries on the server. Lenovo, as a default, does not enable virtualisation. Getting into the BIOS config tool (shift then click the power button, keep holding shift whilst you click restart) was the most time consuming bit of the installation.

Once Docker is installed, pull the container from the Docker store (docker pull gitlab/gitlab-ce). Then run it (docker run --detach --hostname gitlab.rushworth.us --publish 443:443 --publish 80:80 --publish 22:22 --name gitlab --restart always --volume /srv/gitlab/config://c/gldata/etc --volume /srv/gitlab/logs:/var/log/gitlab --volume /srv/gitlab/data://c/gldata/data --volume /svr/docker/gitlab/gitlab://c/gldata/gitlab gitlab/gitlab-ce:latest). You can remap ports (e.g. --publish 8443:443) if needed.

Not quite there yet – you’ve got to edit the container config (docker exec -it gitlab vi /etc/gitlab/gitlab.rb) for your environment. Set a valid external URL (external_url 'http://gitlab.rushworth.us'). I also enabled LDAP authentication to test that out.


gitlab_rails['ldap_enabled'] = true

###! **remember to close this block with 'EOS' below**
gitlab_rails['ldap_servers'] = YAML.load <<-'EOS'
  main: # 'main' is the GitLab 'provider ID' of this LDAP server
    label: 'LDAP'
    host: 'ADHostname.rushworth.us'
    port: 636
    uid: 'sAMAccountName'
    method: 'ssl' # "tls" or "ssl" or "plain"
    bind_dn: 'cn=UserID,ou=SystemAccounts,dc=domain,dc=ccTLD'
    password: 'AccountPasswordGoesHere'
    active_directory: true
    allow_username_or_email_login: false
    block_auto_created_users: false
    base: 'ou=ResourceUsers,dc=domain,dc=ccTLD'
    user_filter: '(&(sAMAccountName=*))' # Can add attribute value to restrict authorized users to GitLab access, we leave open to all valid user accounts in the OU. Should be able to authorize based on group membership using linked attribute value like (&(memberOf=cn=group,ou=groupOU,dc=domain,dc=ccTLD))
    attributes:
      username: ['uid', 'userid', 'sAMAccountName']
      email: ['mail', 'email', 'userPrincipalName']
      name: 'cn'
      first_name: 'givenName'
      last_name: 'sn'
EOS
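
The comment on user_filter mentions restricting access by group membership. As a sketch of how such a filter string could be assembled (in Python, purely for illustration; `build_user_filter` and `escape_filter_value` are hypothetical helper names, and the group DN is the placeholder from the comment above), the value needs its filter-special characters escaped before it is dropped into the parentheses:

```python
def escape_filter_value(value):
    """Escape the characters that are special inside an LDAP filter value (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def build_user_filter(group_dn=None):
    """Build a user_filter string; with no group, any account with a sAMAccountName matches."""
    base = "(sAMAccountName=*)"
    if group_dn is None:
        return f"(&{base})"
    # memberOf holds the DNs of groups the account belongs to directly.
    return f"(&{base}(memberOf={escape_filter_value(group_dn)}))"
```

Calling `build_user_filter("cn=group,ou=groupOU,dc=domain,dc=ccTLD")` gives a string that could be pasted into user_filter. Note that a plain memberOf test only sees direct membership, so nested groups would need a different filter.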


The default is to retain a lot of log files — 30 days! This might be reasonable in a corporate environment, but even for production at home … that’s a lot of space dedicated to log files.


logging['logrotate_frequency'] = "daily" # rotate logs daily
logging['logrotate_rotate'] = 3 # keep 3 rotated logs
logging['logrotate_compress'] = "compress" # see 'man logrotate'
logging['logrotate_method'] = "copytruncate" # see 'man logrotate'


And finally configure SMTP for outbound mail. We don’t use authentication on our SMTP server; it controls relay based on source IP. We do use starttls, but the certificate is not going to be trusted without additional configuration … so I set the ssl verify mode to none.


gitlab_rails['smtp_enable'] = true
gitlab_rails['smtp_address'] = "smtp.hostname.ccTLD"
gitlab_rails['smtp_port'] = 25
# gitlab_rails['smtp_user_name'] = "smtp user"
# gitlab_rails['smtp_password'] = "smtp password"
# gitlab_rails['smtp_domain'] = "example.com"
# gitlab_rails['smtp_authentication'] = "login"
gitlab_rails['smtp_enable_starttls_auto'] = true
# gitlab_rails['smtp_tls'] = false

###! **Can be: 'none', 'peer', 'client_once', 'fail_if_no_peer_cert'**
###! Docs: http://api.rubyonrails.org/classes/ActionMailer/Base.html
gitlab_rails['smtp_openssl_verify_mode'] = 'none'
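
If mail does not arrive, it can help to confirm the relay behaves the way this config assumes before blaming GitLab. The following is a minimal Python sketch of the same combination (no authentication, STARTTLS, certificate not verified); the function names are made up for illustration, and the host and addresses you pass in are whatever applies in your environment.

```python
import smtplib
import ssl
from email.message import EmailMessage


def unverified_tls_context():
    """TLS context that encrypts the session but does not validate the server certificate."""
    context = ssl.create_default_context()
    context.check_hostname = False  # has to be turned off before verify_mode is relaxed
    context.verify_mode = ssl.CERT_NONE
    return context


def send_test_message(host, sender, recipient, port=25):
    """Send one message without logging in; the relay is expected to allow us by source IP."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "GitLab SMTP relay test"
    message.set_content("Sent without authentication over STARTTLS.")
    with smtplib.SMTP(host, port, timeout=10) as server:
        server.starttls(context=unverified_tls_context())
        server.send_message(message)
```

Skipping certificate verification mirrors the 'none' setting above; it keeps the session encrypted but does nothing to prove you are talking to the right server.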


Once the config has been updated, restart the container (docker restart gitlab).

Access the web site and you’ll be prompted to set a password for the admin user, root. You can click the ‘ldap’ tab and log in with Active Directory credentials. Fin.

If we deploy this for a production system, I would set up SSL on the web site and possibly externalize the GitLab database to MySQL. The external database is more of an academic experiment because we already use MySQL (and I still don’t want to learn about vacuuming PostgreSQL).

Self Driving Cars (or Market Driven Algorithms)

I don’t see much of a future for self-driving passenger vehicles. There are two non-tenable options for crash avoidance algorithms. Either the algorithm prioritizes my life and property (which means it would kill someone else to save my life … good for me, bad for society) or it won’t (great for society, but am I going to pay money for a car that will literally kill me to save someone else?). Does the computer assisted human driving model suffer this flaw? An algorithm that engages the brakes any time there is an obstacle within X feet fails to consider the vehicle that is about to slam into the side of your car if you don’t move it into the shrubbery ahead of you.

Self-driving unoccupied vehicles can simply de-prioritize themselves (and the owner needs to accept that financial risk). We may see driving as a service (DaaS?) where a real human is responsible for making these split-second decisions. But allowing people to achieve the metro experience in their own vehicle (i.e. you sit and work for half an hour whilst your conveyance delivers you to your destination) is probably not going to happen.

Web-Accessible History From OpenHAB MySQL Persistence Database

My husband has wanted a quick/easy way to see the data stored in OpenHAB’s MySQL persistence database. He didn’t care for the mysql command line client. He didn’t care for PHPMyAdmin either. I’ve suggested the MyODBC client — which allows you to use MySQL databases as an ODBC data source so you can view the data in MS Access, Excel, etc. Nope – he wanted a web site.

So I put together a very quick (and ugly) PHP page that provides a list of all Items. If you click on an item, you can page through the item’s records. The index.php from the page is available here. You need a web server (I am using Apache on Fedora), PHP (I am using 5.6) and MySQLi (php-mysqlnd package).
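
The page itself is PHP, but the two queries behind it are simple enough to sketch in a few lines of Python. Assumptions to flag loudly: this uses an in-memory SQLite database as a stand-in for MySQL, and it assumes the persistence layout is an Items table (ItemId, ItemName) plus one ItemN table per item with Time and Value columns. Check your own database before relying on those names.

```python
import sqlite3

PAGE_SIZE = 3  # records per page; arbitrary value for the demo


def list_items(conn):
    """Return (ItemId, ItemName) for every item, sorted by name."""
    return conn.execute(
        "SELECT ItemId, ItemName FROM Items ORDER BY ItemName"
    ).fetchall()


def item_page(conn, item_id, page, page_size=PAGE_SIZE):
    """Return one page of an item's records, newest first."""
    # Table names cannot be bound as parameters, so force the id to an int.
    table = f"Item{int(item_id)}"
    return conn.execute(
        f"SELECT Time, Value FROM {table} ORDER BY Time DESC LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    ).fetchall()


def demo_connection():
    """Build a tiny stand-in database with one item and five records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Items (ItemId INTEGER PRIMARY KEY, ItemName TEXT)")
    conn.execute("INSERT INTO Items VALUES (1, 'KitchenTemperature')")
    conn.execute("CREATE TABLE Item1 (Time TEXT, Value REAL)")
    conn.executemany(
        "INSERT INTO Item1 VALUES (?, ?)",
        [(f"2017-01-01 00:0{i}:00", 60.0 + i) for i in range(5)],
    )
    return conn
```

With the demo data, `item_page(conn, 1, 0)` returns the three newest records and `item_page(conn, 1, 1)` returns the remaining two.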

This is a bit of paranoia on my part, but even on a page that is ONLY available internally … I don’t like to use an account with read/write access to display data. I create a new user and assign read access:

CREATE USER 'YourUserName'@'localhost' IDENTIFIED BY 'P#ssw0rdH3r3';
GRANT SELECT ON openhabdb.* to 'YourUserName'@'localhost';
FLUSH PRIVILEGES;

Then use *that* user in the PHP code. This example has a web server running on the database server – and you connect to the MySQL server via localhost. If your web server is located on a different host, you’ll need to create the user and grant access as ‘YourUserName’@ the web server’s hostname instead of localhost.

Custom Password Filter Update (unable to log on after changing password with custom filter in place)

I had written and tested a custom Active Directory password filter – my test included verifying the password actually worked. The automated testing was to select a UID from a pool, select a test category (good password, re-used password, password from dictionary, password that doesn’t meet character requirements, password containing surname, password containing givenName), set the password on the user id. Record the result from the password set, then attempt to use that password and record the result from the bind attempt. Each test category has an expected result, and any operation where the password set or bind didn’t match the expected results was highlighted. I also included a high precision timer to record the time to complete the password set operation (wanted to verify we weren’t adversely impacting the user experience). Published results, documented the installation and configuration of my password filter, and was done.
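
The shape of that test loop can be sketched in Python. Everything here is hypothetical scaffolding rather than the actual test code: `set_password` and `try_bind` stand in for whatever performs the real password set and bind, and the expected results assume a rejected password should also fail to bind.

```python
import time

# Expected (password set succeeds, bind succeeds) for each test category.
EXPECTED = {
    "good password": (True, True),
    "re-used password": (False, False),
    "password from dictionary": (False, False),
    "fails character requirements": (False, False),
    "contains surname": (False, False),
    "contains givenName": (False, False),
}


def run_case(category, set_password, try_bind):
    """Run one test: time the password set, then try to bind with the new password."""
    started = time.perf_counter()  # high-resolution timer
    set_ok = set_password()
    elapsed = time.perf_counter() - started
    bind_ok = try_bind()
    return {
        "category": category,
        "set_ok": set_ok,
        "bind_ok": bind_ok,
        "seconds": elapsed,
        # Highlight anything that differs from the expected outcome.
        "mismatch": (set_ok, bind_ok) != EXPECTED[category],
    }
```

A harness like this only proves what `try_bind` actually exercises; an LDAP bind and a workstation logon are different checks.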

Until the chap who was installing it in production rang me to say he couldn’t actually log in using the password he set on the account. Which was odd – I set one and then did an LDAP bind and verified the password. But he couldn’t use the same password to log into a workstation in the test domain. Huh?? I actually knew people who wanted *some* users to be able to log in anywhere and others to be restricted to LDAP-only logons (i.e. web portal stuff) and ended up using the userWorkstations attribute to allow logon to DCs only.

We opened a case with Microsoft and it turns out that their Password Filter Programming Considerations didn’t actually mean “Erase all memory used to store passwords by calling the SecureZeroMemory function before freeing memory.” What they meant was “If you have created copies of the password anywhere within your code, make sure you erase memory used to store those copies by calling SecureZeroMemory …”

Which makes SO much more sense … as the comments in the code I used as our base say, why wouldn’t MS handle wiping the memory? Does it not get cleaned well if you don’t have a custom password filter?? Remarked out the call to SecureZeroMemory and you could use the password on NTLM authentications as well as Kerberos!

// MS documentation suggests doing this. I honestly don’t know why LSA
// doesn’t just do this for you after we return. But, I’ll do what the
// docs say…
// LJR – 2016-12-15 Per MS, they actually mean to wipe any COPIES you make
// SecureZeroMemory(Password->Buffer, Password->Length);

 

I’ve updated my version of the filter and opened an issue on the source GitHub project … but if anyone else is working a custom password filter, following MS’s published programming considerations, and finds themselves unable to use the password they set … see if you are zapping your copies of the password or the PUNICODE_STRING that comes in.

Active Directory: Custom Password Filtering

At work, we’ve never used the “normal” way of changing Windows passwords. Historically, this is because computers were not members of the domain … so you couldn’t use Ctrl-Alt-Del to change your domain password. Now that computers are members of the domain, changing Active Directory passwords using an external method creates a lot of account lockouts. The Windows workstation is logged in using the old credentials, the password gets changed without it knowing (although you can use Ctrl-Alt-Del, lock the workstation, unlock with the new password, and update the local workstation creds), and the workstation continues using the old credentials and locks the account.

This is incredibly disruptive to business, and quite a burden on the help desk … so we are going to hook the AD-initiated password changes and feed them into the Identity Management platform. Except … the password policies don’t match. But AD doesn’t know the policy on the other end … so the AD password gets changed and then the new password fails to be committed into the IDM system. And then the user gets locked out of something else because they keep trying to use their new password (and it isn’t like a user knows which directory is the back-end authentication source for a web app to use password n in AD and n-1 in DSEE).

A long time ago, I knew some military IT folks who were migrating to Windows 2000 and needed to implement Rainbow series compliant passwords in AD – which was possible using a custom password filter. This meant a custom coded DLL that accepted or rejected the proposed password based on custom-coded rules. I never got into the code behind it – I just knew that they would grab the DLL and how to register it on the domain controller.

This functionality was exactly what we needed — and Microsoft still has a provision to use a custom password filter. Now all we needed was, well, a custom password filter. The password rules prohibit the use of your user ID, your name, and a small set of words that are globally applied to all users. Microsoft’s passfilt.dll takes care of the first two — although with subtle differences from the IDM system’s rules. So my requirement became a custom password filter that prohibits passwords containing case insensitive substrings from a list of words.

I based my project on OpenPasswordFilter on GitHub – the source code prohibits exact string matches. Close, but not quite 🙂 I modified the program to check the proposed password for case insensitive substrings. I also changed the application binding to localhost from all IP addresses since there’s no need for the program to be accessed from outside the box. For troubleshooting purposes, I removed the requirement that the binary be run as a service and instead allowed it to be run from a command prompt or as a service. I’m still adding some more robust error handling, but we’re ready to test! I’ve asked them to baseline changing passwords without the custom filter, using a custom filter that has the banned word list hard coded into the binary, and using a custom filter that sources its banned words list from a text file. Hopefully we’ll find there isn’t a significant increase in the time it takes a user to change their password.
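
The rule itself is tiny. The real filter is compiled code, so the following Python is only an illustration of the check (the function names are hypothetical): fold the case of both the proposed password and each banned word, then reject the password if any banned word appears anywhere inside it.

```python
def load_banned_words(lines):
    """Normalise a banned-word list: trim whitespace, drop blank lines, fold case."""
    return [line.strip().casefold() for line in lines if line.strip()]


def is_password_allowed(password, banned_words):
    """Return False if any banned word appears anywhere in the password, ignoring case."""
    candidate = password.casefold()
    return not any(word in candidate for word in banned_words)
```

An exact-match check would let "xxPaSsWoRdxx" through when "password" is banned; the substring check does not. Folding the word list once at load time keeps the per-password work down to one scan per banned word.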

My updated code is available at http://lisa.rushworth.us/OpenPasswordFilter-Edited.zip

PHP 7.0 and MySQL Libraries

At work, we have some servers running unsupported operating systems. New servers are being built, and applications are being migrated from the old servers to the new. I started with a fairly easy scenario – a PHP web site running on Windows 2008 is moving to 2012. The new web server was handed off to me, and I loaded PHP. With PHP 5.6 active support ending at the end of this year, it made sense to install PHP 7. Copied code, tested site. Umm, massive fail.

Way back in PHP 5.5, the ext/mysql stuff (ext\php_mysql.dll for Windows folks) was deprecated. And if you are like me, you had a lot of old code from back when that was the way to connect to a MySQL database. And as your MySQL was upgraded past 4.1, you had the DBAs setting old_password on your ID so your code continued to work. But the old mysql libraries have been removed in PHP 7 … and you need to use MySQLi or pdo_mysql to communicate with your database now.
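
For anyone wondering what “old” and “new” look like in practice: as I understand it, the pre-4.1 scheme stores a 16-character hex hash, while the 4.1+ native scheme stores an asterisk followed by the 40 hex characters of a double SHA-1. A small Python sketch (helper names are made up for illustration) makes the difference visible and lets you classify a stored hash by its shape:

```python
import hashlib


def native_password_hash(password):
    """Hash in the 4.1+ native format: '*' plus the hex of SHA1(SHA1(password))."""
    inner = hashlib.sha1(password.encode("utf-8")).digest()
    return "*" + hashlib.sha1(inner).hexdigest().upper()


def storage_scheme(stored_hash):
    """Classify a stored password hash as 'new', 'old', or 'unknown' by its shape."""
    if len(stored_hash) == 41 and stored_hash.startswith("*"):
        return "new"
    if len(stored_hash) == 16:
        return "old"
    return "unknown"
```

An account whose stored hash classifies as "old" is the kind the newer client libraries will refuse to authenticate.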

Which one? Depends on what you need – I’ve been using PDO because I don’t need a procedural API (MySQLi provides a procedural API, PDO does not). PDO supports a dozen or so database drivers, MySQLi is just MySQL … so I’ll be able to use the same basic code to connect to MySQL, MS SQL, Oracle, and db2 (plus a handful of others that I don’t anticipate actually using, but who knows what the future holds).

I found a site (http://archive.jnrbsn.com/2010/06/mysqli-vs-pdo-benchmarks) where the individual has benchmarked MySQLi and PDO and doesn’t find much difference on INSERT statements but does see a non-negligible difference on prepared and non-prepared SELECT statements. His post is fairly old, so I ran timed tests on my server using our existing data and found PDO was within a couple of milliseconds. Using either library requires some recoding, but it is fairly straightforward and I was using a script to rebuild my script with the new functions. So I have a nice new server with nice new PHP and nice new MySQL queries using PDO … hit the page to test it and I get a generic error. Add a few lines to my code so I get some sensible errors:

     error_reporting(E_ALL);
     ini_set('display_errors',1);

 

Voila – umm, this is gonna be a problem:

Next PDOException: SQLSTATE[HY000][2054] The server requested authentication 
method unknown to the client 
in D:\vhtml\PKIHome\IssuedDeviceCerts\index.php:46

Stack trace:

#0
D:\vhtml\PKIHome\IssuedDeviceCerts\index.php(46):
PDO->__construct('mysql:host=acil...', 'uidsuppressed', 'pwdsuppressed',Array)

The “server requested authentication method unknown to the client” error means that the new PDO and MySQLi (yes, I’ve tried both) cannot use the password as required for the currently running production code. And the library used in the currently running production code cannot use the password as required for PDO or MySQLi. I cannot just convert the code to the new method, drop it on the new server, cut over, and decommission the old box. There are two approaches that can be used:

**************************************************

#1 Recode against a development MySQL database

#2 Get new IDs using the new storage scheme

**************************************************

#1 If you have a development MySQL database, you can add a hosts file entry (or have your OS support team do so) to point your production database host name to the development server. The development server should be refreshed with data from the production databases. The existing IDs that use the old password storage scheme can be updated with the new password storage scheme (either you provide the current password or a new password can be set). You will then need to update your PHP code to use either PDO or MySQLi. The implementation CRQ to move to your new server then involves (a) having the DBAs update the production user ID to the new password storage scheme, (b) removing the hosts file entry from your server.

 

Advantage – You don’t need to change to a new user ID in your code.

Disadvantage – anything that uses this ID needs to be updated simultaneously. When the new password storage scheme is used on the account, any client requiring the old password storage scheme will fail. If your ID is used for one specific application on one server, then this isn’t a big deal. If you use the ID to write data from a batch server or middleware platform, and then read the data from a PHP site … you need to recode both to use a library that understands the new password storage scheme.

**************************************************

#2 The other option is to get a new ID created that uses the new password storage scheme and have the same permissions granted for that ID. You can then recode individual pages as they are moved to the new server, and the old ID can be removed when all of the sites using it have been moved.

 

Advantage – You don’t need to move everything at once.

Disadvantage – you are making more changes to your code and replacing all of your user IDs (if you have a MyODBC driver to link an Access table into the database or if you use the phpMyAdmin site … you’ll need to remember the new account now).

**************************************************

This isn’t a fatal error that prevents the upgrades from being done, but it sure turned into more of an undertaking than I had originally anticipated! If you should happen to use PHP and MySQL using the old libraries and have a PHP 7 installation planned … it really isn’t just copy some files & update some function calls.

 

Home Automation Lagering

We are about to make mead (we got near 30 pounds of local honey!). In researching mead-making, different yeasts have different alcohol tolerances … so you make a dry mead by using a yeast with an alcohol tolerance at or above the level your starting gravity would yield if it were fully fermented. A sweeter mead means you have a yeast whose tolerance is lower than that value … the greater the difference, the sweeter the mead. We are going to make a dry mead with Lalvin 71b-1122, a just slightly sweet mead by adding a little more honey but still using Lalvin 71b-1122, and a sweeter mead using Lalvin D-47.

71b-1122 has a very broad temperature range (59-86 F – and how cool is it that Google returns a yeast profile summary if you search for “71b-1122 temperature range”). D-47 is more particular — a published range of 59-68 F, but reading through homebrew sites has us wanting to stay around 63 degrees. Our sub-grade level is cool, but not that cool. Especially as fermentation warms up the fluid.

Scott is developing a home automation controlled fermentation “chamber”. The beer refrigerator is now plugged into a smart outlet. One of the Arduino kits we got has a temperature sensor. We can have a temperature probe monitoring the must and cycle the refrigerator’s power to keep it within a degree or two of our target.
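
The control logic for that can be very small. Here is a sketch in Python of a simple hysteresis rule, assuming a 63 F target and a one degree band either side; the function name and defaults are made up for illustration, and reading the probe or switching the smart outlet is left to whatever the home automation system provides.

```python
def fridge_should_run(temperature_f, currently_running, target_f=63.0, band_f=1.0):
    """Decide whether the refrigerator outlet should be powered.

    Hysteresis: switch on above target + band, switch off below target - band,
    and otherwise keep the current state so the compressor is not cycled rapidly.
    """
    if temperature_f > target_f + band_f:
        return True
    if temperature_f < target_f - band_f:
        return False
    return currently_running
```

Holding the current state inside the band is the design choice that matters: a bare "on above 63, off below 63" rule would flip the outlet every time the reading wobbled around the target.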

Programming in Unknown Languages

I’ve often thought that the immersion method of learning a language was setting yourself up for failure – it isn’t like knowing the fundamentals of grammar and pronunciation in English helps you in any way when you find yourself in Karnataka trying to communicate in Sanskrit. There are rather complex algorithms that attempt to derive meaning from an unknown language, but apart from body language, pointing, and gesturing … that’s not something I can manage in real-time as someone speaks to me.

*Programming* languages, on the other hand, I am finding are rather easily learnt by immersion. I know several programming languages quite well – C/C++, F77/F90, perl, and php. I know a dozen or so other languages well enough to get by.

Some of our home automation scripts are written in CoffeeScript (which is evidently a way to write JavaScript without *actually* knowing JavaScript) – and I would never be able to write the program. But to come into the middle of the conversation (i.e. to take someone else’s non-functional code and try to fix it), I can glean enough of the language to debug and fix code. And there’s always Google for any syntax I cannot guess.

I wonder if someone who is fluent in multiple disparate languages (knowing half a dozen Romance languages doesn’t really give you a good base of knowledge – I mean someone who speaks Italian, Hindi, Cantonese, Swahili, and some Levantine dialect of Arabic) is able to do something similar — they know enough words to pretty much guess what words mean & enough different language structures to guess words in their context.