Fix Splunk lockout after exceeded quota

Recently I came across a situation with my home install of Splunk (free license) where the 500MB quota was exceeded three days in a row. I hadn’t checked Splunk for a few days so I was completely blindsided by it. The consequence of going over quota three days in a row? Losing the ability to do any searches in Splunk, which is a real downer.

The easiest, although least convenient, way to fix being locked out is to wait it out. If you go 30 days in a row without violating the license, Splunk will unlock itself. Splunk will still receive and index events during that time. The inability to search makes it really difficult to track down what the problem is, though, and I wasn’t happy waiting for 30 days before getting Splunk back.

Poking around on the Splunk forums I discovered that there is a way to get splunk back – perform a fresh install and then migrate your database and settings over to the fresh install. This involves backing up a few things, then copying them over the fresh install’s default folders

  • $SPLUNK_HOME/var/lib/splunk/defaultdb   #Default Splunk index, where all my data is held. If you have other indexes in here you’ll want to copy them too.
  • $SPLUNK_HOME/etc  #all your configuration files

Simply back up the above folders, install Splunk on a new machine, launch Splunk first so it will generate all the default files, then copy the files over to the new instance.

I went a step further and planned for the future. I wrote a quick and dirty script that will do all of this for you,  even on the same machine – no need to copy to another machine.  The script assumes you’re running a redhat derivative and have the correct Splunk install file in a predictable location. Update the locations of splunk directories and install files as needed and run as root.

#!/bin/bash

#Backup important directories
mkdir /opt/splunkbackup/
cp -al /opt/splunk/etc /opt/splunkbackup/
cp -al /opt/splunk/var/lib/splunk/defaultdb /opt/splunkbackup/

#Nuke splunk
/opt/splunk/bin/splunk stop
rm -rf /opt/splunk

#Reload from fresh start
rpm -iv --replacepkgs /home/nicholas/splunk-6.2.2-255606-linux-2.6-x86_64.rpm
/opt/splunk/bin/splunk start --accept-license

#Restore configuration files and indexes
/opt/splunk/bin/splunk stop
rm -rf /opt/splunk/etc
cp -al /opt/splunkbackup/etc /opt/splunk/
rm -rf /opt/splunk/var/lib/splunk/defaultdb
cp -al /opt/splunkbackup/defaultdb /opt/splunk/var/lib/splunk/
chown splunk:splunk -R /opt/splunk/
/opt/splunk/bin/splunk start

#Remove splunk backup
rm -rf /opt/splunkbackup

This will restore your searches, settings, and data. It won’t restore audit and other internal Splunk information, however. This script worked marvelously in getting my Splunk back.

Fix owncloud client sync “not valid JSON!” error

Recently I migrated my owncloud installation from one webserver to another. I learned that all you have to do is copy the data/ and config/ directories from your owncloud installation to the new machine’s owncloud folder to migrate everything over.

After the migration, I noticed the Windows desktop sync clients stopped working (android worked fine, though.) The main error messasge was not helpful:

Failed to connect to owncloud at https://servername/path/status.php: Unknown error

I found out from here that you can alter the shortcut for the Windows client and append –logwindow to the Target section. Once that was done, I was able to get more information:

03-02 10:01:27:240 0x50f9ec8 networkjobs.cpp:453 status.php from server is not valid JSON!

Manually navigating to status.php in the browser didn’t reveal anything – it appeared to load normally.

After much digging I found a suggestion to check the admin settings within owncloud. This is where I realized I probably didn’t migrate things properly. There was a big warning about invalid .htaccess files. Progress!

The lack of an .htaccess file made me realize that instead of completely moving the entire folder from the old owncloud install to the new, I needed to copy into the existing new owncloud directory. In moving instead of copying I somehow missed a few important files.

I started over, this time copying all files inside data and config into the new owncloud data and config directories. Apparently the Windows sync client requires valid .htaccess configuration before it will work.  Success!

Troubleshoot RSA SecurID in CentOS 6

Unexpected error from ACE/Agent API.

In following this guide for configuring a CentOS 6 system to authenticate with RSA SecurID I came across an unusual error message that had me scratching my head:

Unexpected error from ACE/Agent API.

The problem stemmed from having an incorrect value in the /var/ace/sdopts.rec file for CLIENT_IP. For some reason I had put the IP address of the RSA authentication server in there. CLIENT_IP is the IP address of the RSA client, or rather, the machine you’re working on. The client uses whatever’s in that file to report to the RSA server what its IP address is. If the RSA server gets an invalid IP response from the client, it won’t authenticate.

SELinux issues

Much blood and tears were shed in dealing with getting SELinux to exist harmoniously with RSA SecurID. The problem was exacerbated my the fact that there is a lot of half solutions and misinformation floating out there on the internet. This will hopefully help fix that.

The message entry does not exist for Message ID: 1001

At this point acetest worked beautifully but I could not use an RSA passcode to SSH into the system. Digging into the log revealed this error message:

sshd[2135]: ACEAGENT: The message entry does not exist for Message ID: 1001

Thanks to this post, I realized it was due to selinux. Modifying the selinux config information to allow /var/ace to be read, per the commands below, seemed to fix the issue.

setenforce 0
chcon -Rv --type=sshd_t /var/ace/
setenforce 1

But alas! The solution was not a very good one. The commands above have two problems with them: first, the chcon command is temporary and does not survive selinux policy relabels; second, it assigns the type sshd_t, which does allow SSH to access it, but revokes RSA SecurID’s ability to write to the directory. This is a problem if you ever need to clear node secrets. The server will initiate the wipe but the client will not be able to modify that directory, resulting in node secret mismatches.

I finally decided to RTFM and landed on this documentation page, which explained the issue I was having: selinux mislabeling. The proper solution to this problem is use a label that both SecurID and SSHD can write / read to. Thanks to this SELinux Manpage (it really pays to RTFM!) I discovered that the label I want is var_auth_t (the default label applied when creating /var/ace is var_t, which SSH can’t read.) 

To survive relabeling, use the semanage command, which is not installed by default. Thanks to this site I learned I must install policycoreutils-pithon:

yum install policycoreutils-python

Once semanage is installed, use it to change the label for /var/ace and everything inside it to var_auth_t, then apply the changes with restorecon:

semanage fcontext -a -t var_auth_t "/var/ace(/.*)?"
restorecon -R -v /var/ace

Finally, both RSA SecurID and OpenSSH can read what they need to and authentication is successful.

First acetest succeeds but subsequent ones fail

If you followed the bad advice of relabeling /var/ace to sshd_t you might run across a very frustrating issue where acetest would succeed, but any attempts to SSH into the box or even run acetest again would fail. The error message on the RSA SecurID server was

Node secret mismatch: cleared on server but not on agent

The problem is due to the improper SELinux labeling mentioned above. The fix is the same:

yum install policycoreutils-python
semanage fcontext -a -t var_auth_t "/var/ace(/.*)?"
restorecon -R -v /var/ace

SSH access denied even with successful acetest

If acetest succeeds and you’ve loaded the module into PAM but still get access denied, it could be due to your SSH configuration. Ensure the following options are set:

ChallengeResponseAuthentication yes 
UsePrivilegeSeparation no

Victory.

Join a CentOS machine to an AD domain

I ran into enough snags when attempting to join an CentOS 6.6 machine to a Microsoft domain that I thought I would document them here. Hopefully it is of use to someone. The majority of the experience is thanks to this site.

Update 03/16/2015: I came across this site which makes things a little easier when it comes to initial configuration – messing with other config files is no longer necessary. The authconfig command to do this is below:

authconfig --disablecache --enablelocauthorize --enablewinbind --enablewinbindusedefaultdomain --enablewinbindauth        --smbsecurity=ads --enablekrb5 --enablekrb5kdcdns --enablekrb5realmdns --enablemkhomedir --enablepamaccess --updateall        --smbidmapuid=100000-1000000 --smbidmapgid=100000-1000000 --disablewinbindoffline --winbindjoin=Admin_account --winbindtemplateshell=/bin/bash --smbworkgroup=DOMAIN --smbrealm=FQDN --krb5realm=FQDN

Replace DOMAIN with short domain name, FQDN with your fully qualified domain name, and Admin_account with an account with domain admin privileges, then skip to the Reboot section, as it covers everything before that.

Install the necessary packages

yum -y install authconfig krb5-workstation pam_krb5 samba-common oddjob-mkhomedir

Configure kerberos auth with authconfig

There is a curses-based GUI you can use to do this in but I opted for the command line.

authconfig --disablecache --enablewinbind --enablewinbindauth --smbsecurity=ads --smbworkgroup=DOMAIN --smbrealm=DOMAIN.COM.AU --enablewinbindusedefaultdomain --winbindtemplatehomedir=/home/DOMAIN/%U --winbindtemplateshell=/bin/bash --enablekrb5 --krb5realm=DOMAIN.COM.AU --enablekrb5kdcdns --enablekrb5realmdns --enablelocauthorize --enablemkhomedir --enablepamaccess --updateall

Add your domain to kerberos configuration

Kerberos information is stored in /etc/krb5.conf. Append your domain in the realms configuration, like below

vi /etc/krb5.conf
[realms]
 EXAMPLE.COM = {
 kdc = kerberos.example.com
 admin_server = kerberos.example.com
 }
 
DOMAIN.COM.AU = {
admin_server = DOMAIN.COM.AU
kdc = DC1.DOMAIN.COM.AU
kdc = DC2.DOMAIN.COM.AU
}
 
[domain_realm]
 .example.com = EXAMPLE.COM
 example.com = EXAMPLE.COM
 domain.com.au = DOMAIN.COM.AU
 .domain.com.au = DOMAIN.COM.AU

 Test your configuration

Use the kinit command with a valid AD user to ensure a good connection with the domain controllers:

kinit <AD user account>
It should return you to the prompt with no error messages. You can further make sure it worked by issuing the klist command to show open Kerberos tickets
klist

Ticket cache: FILE:/tmp/krb5cc_0
Default principal: someaduser@DOMAIN.COM.AU
Valid starting Expires Service principal
02/27/14 12:23:21 02/27/14 22:23:21 krbtgt/DOMAIN.COM.AU@DOMAIN.COM.AU
renew until 03/06/14 12:23:19
When I tried the kinit command it returned an error:
kinit: KDC reply did not match expectations while getting initial credentials
 After scratching my head for a while I came across this site, which explains that your krb5.conf is case sensitive – it must all be all upper case. Fixing my krb5.conf to be all caps for my domain resolved that issue.

Join the domain

net ads join domain.com.au -U someadadmin
When I tried to join the domain I received this lovely message:
Our netbios name can be at most 15 chars long, "EXAMPLEMACHINE01" is 16 chars long
Invalid configuration. Exiting....
Failed to join domain: The format of the specified computer name is invalid.
Thanks to Ubuntu forms I learned I needed to edit my samba configuration to assign an abbreviated NETBIOS name to my machine.
vi /etc/samba/smb.conf
Uncomment the “netbios name =” line and fill it in with a shorter (max 15 characters) NETBIOS name.
netbios name = EXAMPLE01
You can test to ensure the join was successful with this command
net ads testjoin

Configure home directories

The authconfig command above included a switch for home directories. Make sure you create a matching directory and set appropriate permissions for it.

mkdir /home/DOMAIN
setfacl -m group:"Domain Users":rwx /home/DOMAIN #the article calls to do this, this command doesn't work for me but home directories still appear to be created properly

Reboot

To really test everything the best way is to reboot the machine. When it comes back up, log in with Active Directory credentials. It should work!

Account lockout issues

I ran into a very frustrating problem where everything works dandy if you get the password correct on the first try, but if you mess up even once it results in your Active Directory account being locked. You were locked out after the first try. Each login, even when successful, had this in the logs:

winbind pam_unix(sshd:auth): authentication failure

This problem took a few days to solve. Ultimately it involved modifying two files:

vi /etc/pam.d/system-auth
vi /etc/pam.d/password-auth

As far as I can tell, the problem was a combination of pam_unix being first (which always failed when using AD login), as well as having both winbind and kerberos enabled. The fix was to change the order of each mention of pam_unix to be below any mention of pam_winbind. The other fix I had to do was to comment out mentions of pam_krb5 completely.

#auth        sufficient    pam_krb5.so use_first_pass

Restrict logins

The current configuration allows any domain account to log into the machine. You will probably want to restrict who can log in to the machine to certain security groups. The problem: many Active Directory security groups contain spaces in their name, which Linux doesn’t like.

How do you add a security group that contains a space? Escape characters don’t seem to work in the pam config files.  I found out thanks to this site that it is easier to just not use spaces at all. Get the SID of the group instead.

Use wbcinfo -n to query the group in question, using the backslash to escape the space. It will return the SID we desire.

wbinfo -n Domain\ Users
S-1-5-21-464601995-1902203606-794563710-513 Domain Group (2)

Next, modify /etc/pam.d/password-auth and add the require_membership_of argument to pam_winbind.so:

auth        sufficient    pam_winbind.so require_membership_of=S-1-5-21-464601995-1902203606-794563710-513

That’s it! Logins are now restricted to the security group listed.

Configure sudo access

Sudo uses a different list for authorization, which amusingly, handles escaped spaces just fine.  Simply add the active directory group in sudo as you a local one, eg using a % and then group name, escaping spaces with a backslash:

%Domain\ Users ALL=(ALL) ALL

Rejoice

You’ve just gone through a long and painful battle. Hopefully this article helped you to achieve victory.

Generate SSL certificate for use with Sophos UTM

HTTPS certificate handling in Sophos UTM is a bit different than other systems. I do this often enough but never remember exactly how to do it.

Here are the “cliff notes” of getting an SSL certificate loaded into Sophos UTM. This can be done on any linux / unix system with openssl installed. The full guide was taken from here.

Generate a private key

When creating your key, make sure you use a passphrase.

openssl genrsa -aes256 -out <keyname>.key 2048

Create a certificate signing request (CSR)

openssl req -new -key keyname.key -out csrname.csr

Upload CSR to your certificate company

Sophos UTM uses Openssl so select that option if prompted by your certificate company Specify Apache CSR if asked. Validate your domain ownership, then wait for e-mail with response.

Download output from certificate company

If they give you a zip file, unzip it first

unzip file_from_authority.zip

Combine all files provided into one

You only have to do this if your CA provides more than one CRT file

cat CA1.crt CA2.crt ...   >  combined.crt

Generate p12 file for use with UTM

Generate a pkcs12 file by supplying all files generated above. Be sure to specify an export password (Sophos requires one.)

openssl pkcs12 -export -in combined.crt -inkey <keyname>.key -out desired_p12_file_name.p12

Upload into Sophos UTM

Navigate to certificate management and specify upload key. Upload the file. Be sure to enter the password you used when creating the key earlier.

That’s it!

Configure iSCSI initiator in CentOS

Below are my notes for configuring a CentOS box to connect to an iSCSI target. This assumes you have already configured an iSCSI target on another machine / NAS. Much of this information comes thanks to this very helpful website.

Install the software package

1
yum -y install iscsi-initiator-utils

Configure the iqn name for the initiator

1
vi /etc/iscsi/initiatorname.iscsi
1
2
InitiatorName=iqn.2012-10.net.cpd:san.initiator01
InitiatorAlias=initiator01

Edit the iSCSI initiator configuration

1
vi /etc/iscsi/iscsid.conf
1
node.startup = automatic
node.session.auth.authmethod = CHAP
node.session.auth.username = initiator_user
node.session.auth.password = initiator_pass
#The next two lines are for mutual CHAP authentication
node.session.auth.username_in = target_user
node.session.auth.password_in = target_password

Start iSCSI initiator daemon

1
2
/etc/init.d/iscsid start
chkconfig --levels 235 iscsid on

Discover targets in the iSCSI server:

1
2
iscsiadm --mode discovery -t sendtargets --portal 172.16.201.200 the portal's IP address
172.16.201.200:3260,1 iqn.2012-10.net.cpd:san.target01

Try to log in with the iSCSI LUN:

1
2
3
iscsiadm --mode node --targetname iqn.2012-10.net.cpd:san.target01 --portal 172.16.201.200 --login
Logging in to [iface: default, target: iqn.2012-10.net.cpd:san.target01, portal: 172.16.201.200,3260] (multiple)
Login to [iface: default, target: iqn.2012-10.net.cpd:san.target01, portal: 172.16.201.200,3260] successful.

Verify configuration

This command shows what is put into the  iSCSI targets database  (the files located in /var/lib/iscsi/)

1
cat /var/lib/iscsi/send_targets/172.16.201.200,3260/st_config
1
2
3
4
5
6
7
8
9
10
11
12
discovery.startup = manual
discovery.type = sendtargets
discovery.sendtargets.address = 172.16.201.200
discovery.sendtargets.port = 3260
discovery.sendtargets.auth.authmethod = None
discovery.sendtargets.timeo.login_timeout = 15
discovery.sendtargets.use_discoveryd = No
discovery.sendtargets.discoveryd_poll_inval = 30
discovery.sendtargets.reopen_max = 5
discovery.sendtargets.timeo.auth_timeout = 45
discovery.sendtargets.timeo.active_timeout = 30
discovery.sendtargets.iscsi.MaxRecvDataSegmentLength = 32768

Verify session is established

1
2
iscsiadm --mode session --op show
tcp: [2] 172.16.201.200:3260,1 iqn.2012-10.net.cpd:san.target01

Create LVM volume and mount

Add our iSCSI disk to a new LVM physical volume, volume group, and logical volume

1
2
3
4
5
6
7
fdisk -l
Disk /dev/sdb: 17.2 GB, 17171480576 bytes
64 heads, 32 sectors/track, 16376 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
1
Disk /dev/sdb doesn't contain a valid partition table
1
2
pvcreate /dev/sdb
vgcreate iSCSI /dev/sdb
lvcreate iSCSI -n volume_name -l100%FREE
mkfs.ext4 /dev/iSCSI/volume_name

Add the logical volume to fstab

Make sure to use the mount option _netdev.  Without this option, Linux will try to mount this device before it loads network support.

1
vi /etc/fstab
/dev/mapper/iSCSI-volume_name    /mnt   ext4   _netdev  0 0

Success.