Performancing Metrics

Vamsi Tokala's blog

Thursday, May 26, 2016

Detecting LDAP Performance Issues

Detecting LDAP Performance Issues

We had detected significant degrade in response times recently related to LDAP during the load tests and we were not sure the reasons for the slowness.

I took the following approach to debug the issue

  1. Captured the LDAP traffic in the server using TCP Dump( Linux)
  2. Import the dump in Wireshark
  3. Modify Display Filter
        Tcp.dstport == port no
     
     
     
  4. and decoded the traffic as shown below
    1. Value is port used to communicate the LDAP Server
    2. LDAP is the communication protocal in Wireshark


 

5 TCP traffic is more readable and I was able to detect the issue with LDAP search against our application and administrators groups



@2016, copyright Vamsidhar Tokala

Wednesday, December 9, 2015

Interesting Work - Part2

System Setup
Ø  System was configured as below for performance benchmark
o   No of Claims Engine Servers configured – 20
o   No of Instances of Adjudication service running per Claims Engine Server – 4
o   No of Threads per Adjudication Service – 5


 

Test Execution

Soon after being engaged for the performance validation effort, we engaged in the following Activities
11) A new environment was setup for the Proof of Concept.
22) Developed automated scripts to Load members, providers and claims
33) Loaded data incrementally to reach the expected volumes via automated SQL scripts
o   24 Million Members,
o   1 Million Providers,
o   60 million Claims  
44) Initially Tested adjudication with only professional claims to validate the scalability and performance.  Distributed the Adjudication service across multiple servers.  

Performance was slow even with 50,000 Claims

No of Claims
Claim Type
No of Adjudication services/Box
No of Servers
No of Threads/service
Processing Time(in minutes)
1000
Professional
4
1
5
1.5
18000
Professional
4
1
5
40 min
50,000
Professional
4
10
5
2 hrs(stopped)
50,000
Professional
4
3
5
2 hrs 20min

Analysis
a.    Observed several queries from portal and other jobs blocking the adjudication service. The following were the top 5 worst queries at the time of the test.
 Name
max_logical_reads
max_elapsed_time
udp_Mem_Enrollment
146322516
805341120
plx_cl_dataset
1058518
29350681
claim_result_code
14038
2145120
claim_run__code
14038
15784905
plx_get_dataset
3326
2042115

b.    These queries are originating from Portal servers. Decision taken to shut down all the server except Claims Engine and SQL servers

c.     Response time Improved from 2 hr to 34 min for 50K after the shutdown
No of Claims
Claim Type
No of Servers
No of Adjudication 
services/Server
No of  Threads/
service
Processing
 Time
(in minutes)
50,000
Professional
4
3
5
2 hrs 20min
50,000
Professional
4
3
5
34 min
 5)   Tests with 200K professional claims
Ran series of performance tests with 200K professional claims. We hit 2 hrs mark with the test
·         CPU utilization in DB was around 60-80%
·         Memory around 92%

No of Claims
Claim Type
No of Adjudication services/Box
No of Servers
No of Threads/service
Processing Time(in minutes)
Comments
200,000
Professional
4
10
5
2 hr 30 min
 With 10 claims engine servers
200,000
Professional
4
6
5
 2 hr 05 min
Reduced no of servers to avoid DB contention
Analysis:
a.    Reducing the no of Claims engine servers from 10 to 6 improved the processing time by 20%. We found that the database becoming a bottleneck on increase in the no of servers

66)   To test and tune some of the config changes faster, Ran series of tests with reduced volumes.

Improved the performance incrementally via snapshot isolation, adding appropriate indexes, partitioning etc.

No of Claims
Claim Type
No of Adjudication services/Box
No of Servers
No of Threads/service
Processing Time(in minutes)
Comments
50,000
Professional
4
3
5
35 min
Turned off ALLOW_PAGE_LOCKS and set LOCK_ESCALATION=DISABLE  on adjudication_key Table - Reverted back the changes
50,000
Professional
4
3
5
33 min
tested Dedicated Queue per service in each server
Reverted back the changes
50,000
Professional
4
3
5
33 min
Modified partition for adjudication_key table based on adjudication object id
50,000
Professional
4
3
5
39 min
Created index for procedure code table which improved Index seeks, key lookups etc.
50,000
Professional
4
3
5
32 min
Snapshot isolation is enabled in DB
7.   Observed one of Stored procedures was slow during the test

a.    Observed plx_counter stored proc take more time in the process of adjudication
SP Name
TotalLogicalReads
AvgLogicalReads
execution_count
Calls/Second
total_elapsed_time
avg_elapsed_time
plx_counter
5127899
26
190674
3
2746844057
14405

b.    plx_counter is updated with a sequence in DB and elapsed time improved from 14405 micro sec to 118 micro secs
SP Name
TotalLogicalReads
AvgLogicalReads
execution_count
Calls/Second
total_elapsed_time
avg_elapsed_time
plx_counter
12
0
7145
27
849034
118

c.    The below tests executed after the stored proc is updated


No of Claims
Claim Type
No of Adjudication services/Box
No of Servers
No of Threads/service
Processing Time(in minutes)
50000
Professional
4
3
5
31 min
200000
Professional
4
6
5
2 hrs
87)   No of Virtual CPU increased from 8 to 16 in Database server and memory increased to 32 GB

Not much improvement observed with the increase of logical CPU’s and memory. Network edge is configured to low in terms of bandwidth. It was modified to high now and results were same, even after modifying to high

No of Claims
Claim Type
No of Adjudication services/Box
No of Servers
No of Threads/service
Processing Time(in minutes)
Comments
200000
Professional
4
6
5
2hrs 30 min +
With 16 core processer and 32 GB memory,
200000
Professional
4
6
5
2 hrs 35 min
With 16 core processer and 32 GB memory
Changing the network edge to high
98)   Evaluated the below network configurations. However Results didn’t show any improvement in performance

·         Disable TCP Chimney.
·         Disable all offloading in vmxnet3 at OS level
·         Disable all offloading at NIC level

19)                VMWare Restrictions
a.    Memory
-          Though we see 32 GB of RAM in the DB box, there was an restriction in VMware to use only 16 GB (16 GB was driver locked)
-          Worked with VMware to reclaim and use all the 32 GB allocated
-          DB server memory utilization was not  beyond 80% after the reclaim
b.    CPU
-          Same as memory, we had reclaimed all the 16 core processer for usage
-          CPU utilization was not beyond 25%.

No of Claims
Claim Type
No of Adjudication services/Box
No of Servers
No of Threads/service
Processing Time(in minutes)
200000
Professional
4
8
5
1 hr 6 min
200000
Mix
4
8
5
1 hr 16 min

110.                Hyper thread ratio not configured correctly IN VMWare

Hyper thread ratio is changed to 16.  Total Logical CPU is 32
No of Claims
Claim Type
No of Adjudication services/Box
No of Servers
No of Threads/service
Processing Time(in minutes)
200000
Mix
4
10
5
55 min
200000
Mix
4
10
5
59 min
200000
Mix
4
10
5
62 min
200000
Mix
4
15
5
45 min


111).   Increased Claims History from 60 Million claims to 780 Million claims. Out of 780 Million claims, 9 Million were completed and in approved status and rest were in Open status.
No of Adjudicated Claims
Avg No of Claims per Member
Processing Time
9 Million
1
48 min
15
70 min
1
112) Observed only 20 out of 32 CPU cores utilized during the test due to SQL Server license limitations.  Reinstalled the SQL server with upgraded license that uses all cores.

Performance improved from 66 min to 55 min

No of Claims
Claim Type
No of Adjudication services/Box
No of Servers
No of Threads/service
Processing Time(in minutes)
200000
Mix
4
20
5
55 min

Database changes to Improve Performance

 There were several Configuration and Schema changes done to the database to enhance performance.
Ø  There was only one data file for TempDB for the entire server. Best practices state that there should be as many data files for TempDB as processors. At the time we changed the configuration on this, the number of processors was 8. It is now 16. We might want to consider adding more data files.
Ø  The default sizes of the TempDB data files was too small. The default setting of TempDB upon installation is 8 MB. When the server reboots, the TempDB data files will always resort back to its original size. The files were expanding a lot from regular usage so the default was changed to 2 GB with 1 GB used as the setting for the log file.
Ø  Max Degree of Parallelism: This was set to 40.
Ø  Network Packet Size: This was set to 10240.
Ø  Query Wait: This was set to 15.
Ø  Minimum Server Memory: This was set to 16GB.
Ø  Maximum Server Memory: This was set to 24GB.
Ø  Snapshot isolation enabled in DB

·         4 tables  related to 700 million claims were partitioned in the database.
·         They were partitioned by their clustered primary keys.
·         4 new file groups were created to separate the data from the rest of the database.

·         The default Fill Factor for indexes was changed to 80.
·         This was changed to Simple.









                                                                                                                   
@2015, copyright Vamsidhar Tokala