Computer Science authors titles recent submissions

(1)

Towards the Adoption of Anti-spoofing Protocols for Email

Systems

Hang Hu, Peng Peng, Gang Wang

Virginia Tech

{hanghu, pengp17, gangwang}@vt.edu

ABSTRACT

Email spoofing is a critical step of phishing, where the at-tacker impersonates someone the victim knows or trusts. In this paper, we conduct a qualitative study to explore why email spoofing is still possible after years of efforts to develop and deploy anti-spoofing protocols (e.g., SPF, DKIM, DMARC). First, we measure the protocol adoption by scanning 1 million Internet domains. We find the adoption rates are still low, espe-cially for the new DMARC (3.1%). Second, to understand the reasons behind the low-adoption rate, we collect 4293 discus-sion threads (25.7K messages) from the Internet Engineering Task Force (IETF), a working group formed to develop and promote Internet standards. Our analysis shows key security and usability limitations in the protocol design, which makes it difficult to generate a positive “net effect” for a wide adoption. We validate our results by interviewing email administrators and discuss key implications to future anti-spoofing solutions.

INTRODUCTION

Phishing attack is a known threat to the Internet. This threat has escalated significantly today due to the heavy involvement of phishing attacks in recent data breaches [44], ransomware outbreaks [30], and even political campaigns [13]. For exam-ple, email phishing has involved in nearly half of the 2000 reported breaches in 2016, responsible for leaking billions of data records [44].

Email spoofing is a critical step in phishing attacks where the attacker impersonates someone that the victim knows or trusts. By spoofing the email address of a reputable organization or a close friend, the attacker has a better chance to deceive the victim [20]. To prevent spoofing, there has been an active effort since early 2000 to develop, promote, and deploy anti-spoofing protocols. Protocols such as SPF [24], DKIM [8], and DMARC [29] have become the Internet standards, allowing email receivers to verify the sender’s identity.

Despite these efforts, however, sending spoofing emails is still surprisingly easy today. As an example, Figure 1 shows a spoofing email where the sender address is set as the U.S. Citizenship and Immigration Services (USCIS). We crafted and sent this email to our own account in Yahoo (as the victim), and it successfully reached the inbox without triggering any warnings. This is not a coincident as email spoofing is still widely used in real-world phishing attacks [44, 35, 13].

The real question is,why email spoofing is still possible af-ter years of efforts spent on the defense. In 2015, two mea-surement studies shed some lights [12, 17], where the result

Figure 1. A spoofing email that impersonates the U.S. Citizenship and Immigration Services (USCIS). We acted as the attacker and sent this email to our own account. The email arrived the inbox without trigger-ing any alert (Mr. Hu is a fictional name).

shows a relatively low adoption rate of anti-spoofing protocols. Among Alexa top 1 million domains, only 40% have adopted SPF and only 1% have DMARC. In 2017, we perform our own measurement as a pilot study and found that the adoption rates were still low (SPF 45.8%, DMARC 3.1%). However, it is not yet clear what causes the slow progress of adopting anti-spoofing solutions.

In this paper, we perform a qualitative study to understand why anti-spoofing protocols are not widely adopted. We first tried to interview email service providers, which turned out to be very difficult to realize. Part of the reason is that the candidate pool is small. People who can provide insights to our questions need to have real-world experience managing email services. In addition, email administrators often hesitate (or are not allowed) to share details about their anti-phishing/spoofing solutions, especially those working in major companies and organizations.

To solve these challenges, we follow a hybrid method com-bining empirical data analysis with small-scale but in-depth interviews. First, we focus on theprotocol designersto un-derstand the technical limitations of these protocols that are likely to create adoption barriers. We collect the public dis-cussion threads from The Internet Engineering Task Force (IETF), a working group formed to develop and standardize SPF, DKIM, and DMARC. By coding their discussions (4293 threads, 25.7K messages), we extract and categorize key us-ability and security issues in the protocol design.Second, we recruit two experienced email administrators from our local institution to conduct in-depth interviews. Although only two email administrators are interviewed, we obtain new insights from theprotocol adopters’ perspectives.

Our findings and contributions are as follows:

(2)

• First, we extract and categorize the security and usability problems rooted in the protocol design by mining the IETF discussion threads. Flaws in SPF and DKIM may allow cer-tain spoofing emails to bypass the authentication or mis-flag legitimate emails sent by mailing lists and mail forwarding. DMARC helps to resolve some of the issues but new prob-lems emerge. These technical probprob-lems negatively impact the perceived usefulness of these protocols.

• Second, there is a lack of a “critical mass” to bootstrap a wide adoption. As network protocols, SPF, DKIM and DMARC have little value unless a large number of Internet domains adopt the protocol. Our result suggests that email providers tend to wait for other people to get on board before making an adoption decision, making it challenging to bootstrap the adoption.

• Third, the costs and benefits for protocol adopters are unbal-anced, particularly for non-email domains. Domain owners are less motivated to publish their SPF/DKIM/DMARC records, since it primarily helps other email services to authenticate emails, but has limited benefits for themselves.

Our results raise key concerns about the current defense against spoofing attacks. We hope these findings can help researchers, email providers and policymakers to make more informed decisions to improve the situation. Moving forward, we argue that the technical issues in the protocol design need to be addressed, and external enforcement/incentives (e.g., from policymakers) are also needed to bootstrap the protocol adoption.

BACKGROUND AND RELATED WORK

We start by describing the email spoofing attack and existing anti-spoofing protocols. Then, we introduce relatedtechnology adoption theoriesto set up the contexts for our study.

SMTP and Email Spoofing

Simple Mail Transfer Protocol (SMTP) is the Internet standard for email transmission [36]. Figure 2 shows a typical email transmission process. A key limitation of SMTP is that it has no built-in security features to prevent people (attackers) from impersonating/spoofing an arbitrary sender address.

To perform a spoofing attack, attackers can manipulate two key fields to send emails. First, after establishing an SMTP connection in step❷_{, the attacker can use the “}_{MAIL FROM}_” command and set the sender address to anyone that they want to impersonate. After that, the “MAIL FROM” address is in-serted into the header as the “Return-Path”. In addition, attack-ers can modify another field called “From” in the email header. This “From” field specifies the address that will be displayed on the email interface [39]. When a user receives the email, the user will see the “From” address (e.g.,[email protected]in Figure 1). If the user replies the email, the reply message will go to the “Return-Path” set by “MAIL FROM”. Note that the two addresses are not always the same.

Email spoofing is a critical step of phishing attacks to gain the victim’s trust [38, 19, 20]. Meanwhile, spoofing is also a strong signal of phishing attacks [19, 25, 26, 31, 37]. Spoofing detec-tion results are often used by phishing detecdetec-tion systems [9, 11, 15, 18, 40].

1 2 3

HTTP SMTP

HTTP IMAP POP SMTP

a.com

Sender Service

b.com

Sender Client

Receiver Service

Receiver Client

[email protected] [email protected]

Figure 2. A typical email transmission process.

Anti-Spoofing Protocols

To detect and prevent email spoofing, SMTP extension pro-tocols are proposed including SPF, DKIM and DMARC. All three protocols have been published or standardized by the Internet Engineering Task Force (IETF).

SPF. Sender Policy Framework (SPF) was proposed in early 2000, and standardized in 2014 [24]. SPF allows a domain to publish a list of IPs that are authorized to send emails on its behalf. For instance, the domaina.com can publish its SPF record in the DNS. When the receiving server receives theMAIL FROMcommand claiming to [email protected], the receiving server can check if the sender IP is listed in the SPF record ofa.com.

DKIM. DomainKeys Identified Mail (DKIM) was first drafted in 2004 and standardized in 2011 [8]. DKIM uses a public-key based approach to authenticate the email sender and check the email integrity. More specifically, the sender’s email service will place a digital signature in the email header signed by the private key associated with the sender’s domain. The receiving service can retrieve the sender’s public key from DNS to verify the signature. To retrieve a DKIM public key from DNS, one will need the selector information (an attribute in the DKIM signature beside the domain name. By verifying the DKIM signature, the receiver can detect if the signed mes-sage has been modified, to ensure integrity and authenticity.

DMARC. Domain-based Message Authentication, Report-ing and Conformance (DMARC) was drafted in 2011 and published in 2015 [29]. DMARC is not a standalone protocol but needs to work with SPF and/or DKIM. DMARC allows the domain owner to publish a “failing policy” which speci-fies what actions the receiver should take when the incoming email fails the DMARC checks. In addition, DMARC requires identifier alignmentfrom SPF or DKIM. For SPF, alignment means thatMAIL FROMaddress used for the SPF check should be consistent with theFromfield in the header. For DKIM, alignment means that the domain name in the DKIM signature should match theFromfield. Alignment ensures the email address that user sees matches with the authenticated address.

The Low Adoption Rates of Anti-spoofing Protocols

In 2015, two measurement studies have shown that anti-spoofing protocols were not widely used among Internet do-mains [12, 17]. Among Alexa top 1 million dodo-mains [3], only 40% of the domains have published an SPF record and 1% have a DMARC record.

(3)

Status (year = 2017) # (%) out of 1 Million Domains With an SPF record 489,025 (48.9%) With a valid SPF record 457,715 (45.8%) Policy: hard fail 118,072 (11.8%) With a DMARC record 31,116 (3.1%) With a valid DMARC record 30,617 (3.1%) Policy: reject 4,200 (0.4%)

Table 1. Our measurement results of the SPF/DMARC adoption among Alexa top 1 million domains (2017).

DKIM usage from the DMARC statistics. As shown in Table 1, the adoption rates are still low compared with two years ago: in 2017, 45.8% of the domains published a valid SPF record and 3.1% have a valid DMARC record. In addition, a significant number of domains have made configuration errors for both SPF and DMARC.

Table 1 shows that even fewer domains specify a strict “reject” policy. A strict policy means the receiver is instructed to reject the email when the authenticate fails.1 We find only 11.8% of the domains set “hard fail” for SPF, and 0.4% set “reject” for DMARC. Recall that DKIM relies on DMARC to publish its policies. The fact that only 3.1% of the domains have a valid DMARC record (0.4% has a “reject” policy) indicates that most DKIM adopters also did not specify a strict policy.

Our measurement result raises serious concerns about the ef-fectiveness of the current spoofing defense. We are motivated to further explore the reasons behind the low adoption rates of anti-spoofing protocols.

Technology Adoption Theories

To provide the contexts for our study, we now introduce key theories for technology adoption [34, 28, 22, 23]. With a focus on network and security protocols, prior works have examined the adoption of DNSSEC [6], IPv6 [7], HTTPS [14], Bitcoin [5], and Biometric tracking [2]. Below, we discuss three adoption theories related to our study.

TAM. Technology Acceptance Model (TAM) is the most basic theory that models user intention to adopt new tech-nologies [43, 42]. The model describes the key factors that influence user decision, the most important of which are per-ceived usefulness(PU) andperceived ease-of-use(PEOU). In our context, “users” refer to email services. TAM has many extended versions with more factors added to the model (e.g., self-efficacy, quality of the system) [27, 43, 10, 32].

Network Externalities. For network protocols, the standard TAM is usually not sufficient to explain their adoption since individual user’s decision is likely to affect other users. This leads to the notion ofNetwork Externalities(or net effect) [22, 5]. Network externalities mean that an individual adopter can add the value for other people to adopt the same technology. In other words, when more users adopt the same protocol, the value of the protocol to each user will also increase [41]. For anti-spoofing protocols, if more domains publish their SPF/DKIM/DMARC records, it makes easier for other email providers to detect spoofing emails.

1_{If a domain has both SPF and DMARC, DMARC overwrites SPF as}

long as the DMARC policy is not “none”, DKIM relies on DMARC to publish its failing policy.

Cost-Benefit Model. Ozment and Schechter propose an adoption model that focuses on the cost-benefit perspec-tive [33]. The model argues that only when thebenefitsto individual adopters overweight the adoptioncostswill the pro-tocol be widely accepted. For network propro-tocols, the per-user benefits may grow as more users adopt the protocol (net ef-fect) [1]. The costs can be either constant or changing (mostly decreasing) as more users adopt the protocol.

Often cases, a network protocol requires a minimal level of deployment before creating enough benefits to overweight the costs. This leading to notions ofcritical masswhich represents the minimal number of adopters in order to facilitate self-sustaining adoption or create further growth [41].

RESEARCH QUESTIONS AND METHODOLOGY

In this paper, we conduct an exploratory study to understand the adoption of anti-spoofing protocols. We qualitatively look into two key questions. First, what are the reasons behind the relatively low rate of anti-spoofing protocols? Second, why did most domain owners configure the protocol with relaxed failing policies?

To answer these questions, we first tested the stand method (i.e., interviewing email administrators), and found it chal-lenging to execute in practice. First, our candidates need to have real-world experience developing or deploying anti-spoofing protocols. This narrows down the candidate pool to a small, highly specialized user population. Second, email administrators are often reluctant to participate due to the sen-sitivity of the topic. For many companies and organizations, details about their phishing/spoofing detection systems are non-disclosable. Even if the administrators agree to discuss related topics, it would require additional approval from their supervisors. In our pilot study, we contacted the administrators of Alexa top 4000 domains for a user study request and asked about the level of details of their experience that can be shared. We received 7 responses (0.175% return rate), who can only talk about the generic issues but cannot get into details. We decided not to use this method.

(4)

Working Group #Threads #Messages #Participants SPF 600 4,412 98 DKIM 3,177 17,309 217 DMARC 516 3,965 173 Total 4,293 25,686 410

Table 2. Dataset collected from IETF working groups.

Step 1: Coding IETF Discussion Threads

First, we analyze the discussion threads collected from IETF to understand the perceived value and limitations of SPF, DKIM and DMARC from the protocol designers’ perspectives.

Dataset. We collect the complete historical discussion threads for SPF, DKIM and DMARC from the IETF archive. The overall statistics are shown in Table 2. In total, we have 4,293 discussion threads and 25,686 messages from 410 dis-cussion participants.

Coding Method. Our goal is to extract high-level themes regarding each protocol’s limitations. We first tried topic modeling (Latent Dirichlet Allocation, or LDA [4]), seeking to obtain the “clusters” of topics for the discussion threads. However, we find that the clusters produced by LDA are highly noisy, which are barely usable for our purpose.

Instead of using LDA, we decide to manually code the discus-sion threads into high-level themes following the open coding method [16]. We mix all the discussion threads from the three discussion boards before the coding. This is because the topics from different boards are often related. For example, many discussion threads under DMARC are addressing the problems of SPF. We mark out the discussion threads that are related to the technical problems of these protocols. Then, we code the text and cluster them into high-level themes. These steps are executed by two authors independently. A third author resolves the conflicts in their results.

Step 2: Interview

Second, we interview two email administrators from our lo-cal institution to examine theprotocol adopters’ perceptions. One administrator operates the campus email system, and the other one runs a department-level email service (two services are independent). Both interviewees have over 10 years of experience operating email services. The small number of interviewees (from a single institution) is a clear limitation. This is due to the difficulty to recruit email administrators from other organizations and receive the approval forin-depth discussions. Even so, the two interviews have provided use-ful qualitative results to understand the perceived value and limitations of anti-spoofing protocols.

Our interview is semi-structured, primarily to discuss the email administrators’ personal experience of deploying and config-uring the anti-spoofing protocols. We have face-to-face inter-view sessions with a total length of 45 to 60 minutes. First, we discuss the value of the anti-spoofing protocols and the chal-lenges to deploy them in practice. Then, we discuss how email services and domain owners usually make their decisions to use the protocol. Finally, we discuss the possible solutions to

Domain displayed to users Domain used for SPF test

Return-Path: [email protected] Received-SPF: pass (domain of attacker.com designates 123.456.789.00 as permitted sender) Subject: Urgent! Response needed immediately. From: [email protected]

To: [email protected]

Figure 3. SPF: SPF test is based on the domain of “Return-Path”, which can be different from the domain that the user sees (the “From” field).

DKIM-Signature: v=1; a=rsa-sha256; d=attacker.com; s=pp-dkim1; c=relaxed/relaxed; q=dns/txt;

[email protected]; t=1480474251;

h=From:From:Subject:Date:To:MIME-Version:Content-Type; bh=…; b=...

Subject: Urgent! Response needed immediately. From: [email protected]

To: [email protected]Domain displayed to users

Domain that signs the message

Figure 4. DKIM: the domain used to sign the email can be different from the domain displayed to users (the “From” field).

email spoofing beyond the anti-spoofing protocols. We took audio recordings with their informed consent.

Our study is approved by IRB. We ensure that all the data are properly anonymized and securely stored. In the following, we report the detailed results and findings

RESULT: IETF DISCUSSION THREADS

By applying the coding methodology to the IETF dataset, we obtain the results shown in Table 3. For each protocol, we categorize and summarize the key limitations in the protocol design discussed by IETF working groups. In the following, we briefly describe each of the coded themes and discuss the implications to the protocol adoption in practice.

Usability Problems of SPF and DKIM

SPF and DKIM both have usability problems (P1and P4) related to “identifier alignment”. It means that the sender email address that user sees can be different from the address that is actually used to do perform authentication. Figure 3 and Figure 4 show how attackers can exploit this vulnerability to bypass SPF and DKIM while still [email protected].

For SPF, the authentication focuses on the “Return-Path” and examines whether the sender’s IP is listed in the “Return-Path” domain’s SPF record. In Figure 3, the attacker can set the “Return-Path” domain asattacker.com. Since the attacker has a full control overattacker.com, she can add the IP to the SPF record to pass the authentication. However, what the receiving user sees on the email interface [email protected]

set by the “From” field.

DKIM has a similar problem. DKIM authenticate emails by verifying the signature signed by the domain in the “d=” field. In Figure 4, the attacker uses her own domainattacker.com

(5)

Protocol Problem Type Problem Description Thread (Msg) SPF

P1. Alignment Usability The SPF verified sender address can be different from the one displayed to users. 8 (116) P2. Mail forward Security A forwarded email by default cannot pass the SPF test. 42 (554) P3. Mailing list Security Emails sent to a mailing list by default cannot pass the SPF test. 54 (615) DKIM

P4. Alignment Usability The sender domain that signed DKIM can be different from the one user sees. 10 (122) P5. Mailing list Security Mailing lists often modify the email content, which will fail the DKIM test. 201 (2525) P6. Non-email Security DKIM cannot protect non-email domains from being spoofed. 0 (0) DMARC+SPF P2. Mail forward_{P3. Mailing list} Security_Security A forwarded email by default cannot pass the SPF test, and thus fails DMARC._{Emails sent to a mailing list cannot pass SPF and DMARC at the same time.} 42 (554)_{54 (615)} DMARC+DKIM P5. Mailing list Security Mailing lists often modify the email content, which will fail the DKIM test. 201 (2525) DMARC+SPF+DKIM P5. Mailing list Security SPF always fails; DIKM will fail if the mailing list modifies email content. 201 (2525)

Table 3. Usability and security problems of SPF, DKIM and DMARC. We cluster the IETF discussion threads into different “problems” (P1–P5). P6 of DKIM is by design, which is listed here for completeness.

As shown in Table 3, DMARC+SPF or DMARC+DKIM miti-gate the problem by enforcing the alignment of the authenti-cated address and the address that is displayed to users. The problem is that DMARC has not been widely adopted yet (3.1% of the domains, Table 1).

Security Problems of SPF

SPF also has security problems that will make legitimate emails fail the authentication (P2andP3).

P2. Mail Forwarding. Mail forwarding means one email service automatically forwards emails to another email service. A common scenario is that university students often configure their university email service to forward all their emails to Outlook or Gmail.2SPF will fail after mail forwarding because the forwarder’s IP will not match the original sender’s SPF record. For example, a user configures her email serviceA

is to automatically forward emails to serviceB. When one of her emails is sent byCtoA, this email will be automatically forwarded toBbut the “Return-Path” still points toC. When

Bperforms the SPF check, it will fail becauseA’s IP will not match withC’s SPF record.

P3. Mailing List. When a message is sent to a mailing list, the mailing list will “broadcast” the message to all the subscribers. This is a similar process as mail forwarding. During this process, the mailing list’s IP will become the sender IP, which is different from the original sender’s IP. This will leads to SPF failure.

To avoid SPF failures for mail forwarding (or mailing list), one possible approach is to change the “Return-Path” to the for-warder’s domain. However, this creates more problems. First, the modified “Return-Path” will be different from the “From” field, which violates the DMARC alignment requirement. As shown in Table 3, P2 and P3 are still there for DMARC+SPF. In addition, if the user wants to reply the email, the modified “Return-Path” is no longer the original email sender, which

breaks the communication.

Security Problems of DKIM

DKIM’s security problems can also make legitimate emails to fail the authentication (P5). In addition, DKIM does not apply to non-email services by design (P6).

2_{Mail forwarding here refers to the automated forwarding between}

mail servers where the email meta data (e.g., “Return-Path”) remains unchanged. This is different from a user manually forwarding an email to another user, which is equivalent to sending a new email.

P5. Mailing List. Recall that DKIM’s signature is gen-erated by signing the email header and content. Mailing lists will cause trouble because most mailing lists modify the email content before broadcasting it to the subscribers. The common modification is to add a “footer” with the mailing list’s name and a link for un-subscription. Tempering the email content will cause DKIM failure.

P6. Non-Email domains. DKIM does not apply to non-email domain. P6 is based on the design of DKIM, not de-rived from the IETF discussion threads. We list P6 in Table 3 for completeness. For example,office.comhosts the web-site for Microsoft’s Office software, but it is not an email domain (no MX server). Even so, attackers can still spoof

[email protected] phish Microsoft Office users. The do-main owner of office.com needs to publish an SPF and DMARC record to prevent this domain from being spoofed.

Security Problems of DMARC

As shown in Table 3, DMARC helps to solve some of the usability and security problems of SPF and DKIM. However, when a domain adopts DMARC, it may not have both SPF and DKIM in place. So problem P2, P3, and P5 may still exist. Failures of SPF and DKIM will cause DMARC to fail. This explains why DMARC tends to use relaxed failing policies, which is to avoid accidentally rejecting legitimate emails.

Even if a domain sets up all three protocols, the mailing list is still a big problem. For mailing lists, DMARC+SPF will be sure to fail — if the “Return-Path” is modified, DMARC will fail due to the misalignment of identifiers; if the “Return-Path” is unmodified, SPF will fail due to the IP mismatch. For DMARC+DKIM, the situation is better as long as the mailing list does not temper the email content. However, in practice, mailing lists are obligated to inform users of the mailing list’s name and ways to unsubscribe. This problem is under active discussions within the IETF working group [21].

Validation and Summary

(6)

Adoption Rate (Fraction of Domains)

0 1

$

(a) Email Domains

Benefit

Base cost Cost to han_dle insecure do_mains

Total Cost

Adoption Rate (Fraction of Domains)

0 1

$

(b) Non-Email Domains

Benefit Cost

Figure 5. The adoption model for anti-spoofing protocols. For email do-mains, the cost and benefit changes as more domains adopt the protocol. For non-email domains, the cost and benefit stay constant.

limitation could affect theperceived usefulnessandperceived ease-of-useof these protocols. In the next step, we further val-idate our observations by conducting interviews and examine the perceived values and flaws of SPF, DKIM and DMARC fromprotocol adopters’ perspectives.

RESULT: INTERVIEW

Our interviews have focused on questions regarding the adop-tion of SPF, DKIM and DMARC. Before the interview, we make it clear that interviewees don’t need to disclose spe-cific details about the email services that they manage. Even though we only interviewed two email administrators, their responses are quite consistent. They have acknowledged the technical limitations of SPF, DKIM and DMARC, and more importantly, pointed out the non-technical challenges for their adoption. In the following, we refer the campus-level email admin as U1and the department-level admin as U2. We process the interview transcripts in a round of open coding, where the data is conceptualized and coded. Then we extract 4 high-level themes regarding why people are hesitated to adopt anti-spoofing protocols.

Theme 1: A Lack of Critical Mass

BothU1andU2mentioned that there had not been a global consensus that SPF, DKIM or DMARC should be the ultimate solution to stop spoofing. Part of the reason is these protocols are struggling to support common email scenarios such as mail forwarding (U1), and the perceived usefulness is not high enough. Potential adopters might be are waiting to see whether enough people would eventually get on board.

“U2: It is not the final answer that the industry picked up yet. I felt at this point that enough people haven’t really adopted it, it’s not worth for me to set it up.”

This reflects a typical bootstrapping challenge, where a “crit-ical mass” is needed in order to facilitate a self-sustaining adoption process [33].

Theme 2: Benefits Not Overweight Costs

U1andU2also discussed the deeper reasons of the lack of critical mass. More specifically, the protocol adopter does not directly benefit from publishing their SPF, DKIM or DMARC records to the DNS. In fact, the published DNS records are pri-marily helpingother email servicesto verify incoming emails and protecting the customers (users) of other email services.

For domains that publish the DNS records, the benefits are relatively small.

“U1: If I am an email provider, I am not motivated to set up SPF, I am motivated to make sure people who have sent (emails) to my customers have set SPF. I am motivated to evaluate it.”

“U2: You are relying on everyone else setting it up.”

Figure 5 illustrates the cost and benefit trade-off. We follow the Ozment-Schechter model [33] with specific justifications for email spoofing protocols. We also build separated models for email domains and non-email domains.

For email domains (Figure 5(a)), when more domains publish their SPF, DKIM or DMARC records, the benefits for each adopter will increase because more incoming emails can be authenticated. Regarding the costs, there will be a constant base costfor deploying the protocol. On top of that, early adopters also need to handle the insecure domains that have not adopted the protocol and those with misconfigurations. The cost of insecure domains will drop as more domains adopt the protocol. However, this cost cannot reach zero due to the technical issues in protocol design (Table 3).

Figure 5(b) shows a bigger challenge to motivate non-email mains to publish the SPF/DMARC record. For non-email do-mains (e.g.,office.com), publishing SPF/DMARC records is primarily helping other email services to detect spoofing (i.e. being a good “citizen”). The non-email domains themselves do not receive real benefits other than a better reputation.

Overall, the cost and benefit model is not in favor of creating a “critical mass” for a wide adoption. The bootstrapping phase

is challenging without external enforcement or incentives.

Theme 3: Deployment Difficulties in Practice

Even if an email administrator decided to deploy the protocol, there would be other challenges in the way. For example, certain services do not have a control over their DNS record. Publishing SPF/DKIM/DMARC record will incur additional overhead to coordinate with their DNS providers (U1andU2).

“U1: So we have very limited control over our DNS. Right now, it is just the difficulty of setting up that DNS.”

Another challenge is that companies and organizations may not maintain their own mail servers, but rely on cloud-based email services. It is not their decision to deploy the anti-spoofing protocol if the protocol is not supported by the cloud email service (U1).

Theme 4: Risks of Breaking the Existing System

U2also discussed the concerns of breaking the existing email system due to unfamiliarity to the protocol. This is particularly true for DMARC (published in 2015). Email providers need to go through careful testing to make sure the protocol does not block legitimate incoming emails, and their own emails are not blocked by others.

(7)

them. Relatively it (DMARC) is still pretty new for big businesses and such.”

U1andU2both mentioned that in general there was no penalty to domains for not publishing an SPF/DKIM/DMARC record. Their emails are typically not discriminated, unless other mali-cious signals are detected.

DISCUSSION

So far, we have described the key challenges for SPF, DKIM and DMARC to receive a wide adoption. Next, we discuss the implications of these challenges to future protocol design and possible solutions moving toward.

Improving the Perceived Usefulness. The security and usability issues in SPF, DKIM and DMARC negatively impact their perceived usefulness, which is a determining factor for the protocol adoption (TAM [43]). Using all 3 protocols helps to mitigate some of the problems, but not all of them (e.g., the “mailing list” issue still persists).

To improve the perceived usefulness, addressing these security and usability issues becomes the first priority. Currently, an IETF group is working on a new protocol called Authenticated Received Chain (ARC) [21] which is expected to address the mailing list problem of DMARC. However, this also adds to the number of protocols that domain owners need to deploy. New protocols will have their own challenges to be accepted. For example, the DMARC protocol, even though incremen-tally deployable, only achieved a 3.1% adoption rate in the past two years. To this end, it is not clear if creating another brand new protocol is the best solution.

Building the Critical Mass. Currently, there is a lack of strong consensus to deploy anti-spoofing protocols. For email services, they are motivated to perform the authenti-cation check to protect their own users, but are not as moti-vated to publish SPF/DKIM/DMARC records to help other email services. For non-email services, the incentives are even weaker. As illustrated in Figure 5(a), only after enough domains adopted anti-spoofing protocols can these protocols produce real benefits. This lead to a bootstrapping problem.

To bootstrap the adoption and establish a critical mass, exter-nal mechanisms are often needed. In theory, we can adjust the rewarding function to provide more benefits to early adopters to create a positive net effect [33]. In addition, policymakers or major email providers may consider enforcing certain (sen-sitive) domains to publish their SPF/DKIM/DMARC records. The question is how to realize these ideas in practice without disrupting the operation of the current email system. This requires further research efforts and careful experimentations.

LIMITATIONS

Our study has several limitations. First, the scale of the study (particularly the interview) is very small. This limits us from producing any statistically significant results. We argue that our contribution is to provide a “qualitative” understanding of the problem space, which lays the ground work for future quantitative research. For example, one future direction is to conduct surveys to understand what types of domains are

more likely to adopt anti-spoofing protocols, and how domain attributes (e.g., service type, popularity, sensitivity) affect the domain owners’ decision. Second, our coding method for the IETF dataset requires significant manual efforts, which is not ideal. There is a possibility that advanced text mining tools (beyond LDA) can help to make this process more manageable.

CONCLUSION

In this paper, we examine why email spoofing is (still) possible in today’s email system. First, our measurement results con-firm that anti-spoofing protocols (SPF, DKIM, DMARC) are not widely accepted. Then we qualitatively study the possible reasons of the low adoption rates. By analyzing the discussion threads in IETF and interviewing email administrators, we provide a deeper understanding of the perceived value and limitations of anti-spoofing protocols. Our results show that key security and usability limitations are rooted in the pro-tocol design which hurts the perceived usefulness of these protocols. This also makes it difficult to establish a “critical mass” to facilitate a positive net effect for a wider adoption. Moving forward, extensive efforts are needed to address the technical issues in the protocol design, and develop external enforcement (or incentives) to bootstrap the protocol adoption.

REFERENCES

1. Bernard Aboba and Dave Thaler. 2008. What Makes For a Successful Protocol?RFC5218(2008).

https://tools.ietf.org/html/rfc5218.

2. Syed Ishtiaque Ahmed, Md. Romael Haque, Shion Guha,

Md. Rashidujjaman Rifat, and Nicola Dell. 2017. Privacy, Security, and Surveillance in the Global South: A Study of Biometric Mobile SIM Registration in Bangladesh. InProc. of CHI’17.DOI:

http://dx.doi.org/10.1145/3025453.3025961

3. Alexa. 2017. (2017).http://www.alexa.com.

4. David M Blei, Andrew Y Ng, and Michael I Jordan. 2003. Latent dirichlet allocation.Journal of machine Learning research3 (2003), 993–1022.

http://dl.acm.org/citation.cfm?id=944919.944937

5. Rainer Böhme. 2013. Internet protocol adoption: Learning from Bitcoin. InProc of IAB Workshop on Internet Technology Adoption and Transition (ITAT’13).https://www.iab.org/wp-content/ IAB-uploads/2013/06/itat-2013_submission_17.pdf. 6. Taejoong Chung, Roland van Rijswijk-Deij, David Choffnes, Dave

Levin, Bruce M. Maggs, Alan Mislove, and Christo Wilson. 2017. Understanding the Role of Registrars in DNSSEC Deployment. InProc. of IMC’17.

7. Lorenzo Colitti, Steinar H. Gunderson, Erik Kline, and Tiziana Refice. 2010. Evaluating IPv6 Adoption in the Internet. InProc of PAM’10.DOI:

http://dx.doi.org/10.1007/978-3-642-12334-4_15

8. M. Kucherawy D. Crocker, T. Hansen. 2011. DomainKeys Identified Mail (DKIM) Signatures(RFC6376).DOI:

http://dx.doi.org/10.17487/RFC6376

9. Prateek Dewan, Anand Kashyap, and Ponnurangam Kumaraguru. 2014. Analyzing social and stylometric features to identify spear phishing emails. InProc. of eCrime’14.DOI:

http://dx.doi.org/10.1109/ECRIME.2014.6963160

10. Timm Dickfeld. 2011. Quo vadis ventricular tachycardia ablation: new horizons and developments.Expert Review of Medical Devices8, 2 (2011), 131–133.DOI:http://dx.doi.org/10.1586/erd.10.82

11. Sevtap Duman, Kubra Kalkan-Cakmakci, Manuel Egele, William K. Robertson, and Engin Kirda. 2016. EmailProfiler: Spearphishing Filtering with Header and Stylometric Features of Emails. InProc. of ACSAC’16.DOI:

(8)

12. Zakir Durumeric, David Adrian, Ariana Mirian, James Kasten, Elie Bursztein, Nicolas Lidzborski, Kurt Thomas, Vijay Eranti, Michael Bailey, and J. Alex Halderman. 2015. Neither Snow Nor Rain Nor MITM: An Empirical Analysis of Email Delivery Security. InProc. of IMC’15.DOI:http://dx.doi.org/10.1145/2815675.2815695

13. FBI. 2016. GRIZZLY STEPPE: Russian Malicious Cyber Activity. FBI and DHS report. (2016).https:

//www.us-cert.gov/sites/default/files/publications/ JAR_16-20296A_GRIZZLY%20STEPPE-2016-1229.pdf.

14. Adrienne Porter Felt, Richard Barnes, April King, Chris Palmer, Chris Bentzel, and Parisa Tabriz. 2017. Measuring HTTPS Adoption on the Web. InProc. of USENIX Security’17.

https://www.usenix.org/conference/usenixsecurity17/ technical-sessions/presentation/felt.

15. Ian Fette, Norman Sadeh, and Anthony Tomasic. 2007. Learning to Detect Phishing Emails. InProc. of WWW’07.

http://doi.acm.org/10.1145/1242572.1242660

16. Uwe Flick. 2014.An introduction to qualitative research. Sage. 17. Ian D. Foster, Jon Larson, Max Masich, Alex C. Snoeren, Stefan Savage,

and Kirill Levchenko. 2015. Security by Any Other Name: On the Effectiveness of Provider Based Email Security. InProc. of CCS’15.

DOI:http://dx.doi.org/10.1145/2810103.2813607

18. Grant Ho, Aashish Sharma, Mobin Javed, Vern Paxson, and David Wagner. 2017. Detecting Credential Spearphishing in Enterprise Settings. InProc. of USENIX Security’17.https://www.usenix.org/system/ files/conference/usenixsecurity17/sec17-ho.pdf.

19. Jason Hong. 2012. The State of Phishing Attacks.Commun. ACM55, 1 (2012).DOI:http://dx.doi.org/10.1145/2063176.2063197

20. Tom N. Jagatic, Nathaniel A. Johnson, Markus Jakobsson, and Filippo Menczer. 2007. Social Phishing.Commun. ACM50, 10 (2007).DOI:

http://dx.doi.org/10.1145/1290958.1290968

21. Ed. S. Jones Ed. M. Kucherawy Ed. K. Andersen, B. Long. 2017. Authenticated Received Chain (ARC) Protocol(Internet-Draft’17).

https://tools.ietf.org/html/ draft-ietf-dmarc-arc-protocol-09.

22. Michael L Katz and Carl Shapiro. 1986. Technology adoption in the presence of network externalities.Journal of political economy94, 4 (1986), 822–841.DOI:http://dx.doi.org/10.1086/261409

23. Sunyoung Kim, Jennifer Mankoff, and Eric Paulos. 2015. Exploring Barriers to the Adoption of Mobile Technologies for Volunteer Data Collection Campaigns. InProc. of CHI’15.DOI:

http://dx.doi.org/10.1145/2702123.2702378

24. S. Kitterman. 2014. Sender Policy Framework (SPF)(RFC7208).

25. Viktor Krammer. 2006. Phishing Defense Against IDN Address Spoofing Attacks. InProc. of PST’06.DOI:

http://dx.doi.org/10.1145/1501434.1501473

26. Ponnurangam Kumaraguru, Yong Rhee, Alessandro Acquisti, Lorrie Faith Cranor, Jason Hong, and Elizabeth Nunge. 2007. Protecting People from Phishing: The Design and Evaluation of an Embedded Training Email System. InProc. of CHI’07.DOI:

http://dx.doi.org/10.1145/1240624.1240760

27. PC Lai. 2017. The Literature Review of Technology Adoption Models and Theories For The Novelty Technology.Journal of Information Systems and Technology Management14 (04 2017), 21 – 38.DOI:

http://dx.doi.org/10.4301/S1807-17752017000100002

28. Joseph Lindley, Paul Coulton, and Miriam Sturdee. 2017. Implications for Adoption. InProc. of CHI’17.DOI:

http://dx.doi.org/10.1145/3025453.3025742

29. E. Zwicky M. Kucherawy. 2015. Domain-based Message Authentication, Reporting, and Conformance (DMARC)(RFC7489).

30. Malwarebytes. 2016. Understanding the Depth of the Global Ransomware Problem. (2016).https://www.malwarebytes.com/ pdf/white-papers/UnderstandingTheDepthOf

RansomwareIntheUS.pdf.

31. D. Kevin McGrath and Minaxi Gupta. 2008. Behind Phishing: An Examination of Phisher Modi Operandi. InProc. of LEET’08.DOI:

http://dx.doi.org/citation.cfm?id=1387713

32. Chuttur M.Y. 2009. Overview of the Technology Acceptance Model: Origins, Developments and Future Directions.Sprouts: Working Papers on Information Systems9, 37 (2009).

http://sprouts.aisnet.org/9-37.

33. Andy Ozment and Stuart E Schechter. 2006. Bootstrapping the Adoption of Internet Security Protocols. InProc. of WEIS’06.http:// andyozment.com/papers/Ozment_Schechter-Bootstrapping_ the_Adoption_of_Internet_Security_Protocols.pdf. 34. Stephen L Parente and Edward C Prescott. 1994. Barriers to technology

adoption and development.Journal of political Economy102, 2 (1994), 298–321.http:

//www.journals.uchicago.edu/doi/pdfplus/10.1086/261933. 35. PhishMe. 2016. Ransomware Delivered by 97% of Phishing Emails by

end of Q3 2016 Supporting Booming Cybercrime Industry. (2016).

https://phishme.com/ransomware-delivered-97-phishing-emails-end-q3-2016-supporting-booming-cybercrime-industry/. 36. J. B. Postel. 1982. Simple Mail Transfer Protocol(RFC821).

37. Pawan Prakash, Manish Kumar, Ramana Rao Kompella, and Minaxi Gupta. 2010. Phishnet: Predictive Blacklisting to Detect Phishing Attacks. InProc. of INFOCOM’10.DOI:

http://dx.doi.org/10.1109/INFCOM.2010.5462216

38. Proofpoint. 2016. Threat Summary and Year in Review. (2016).

https://www.proofpoint.com/sites/default/files/ proofpoint_q4_threat_report-final.pdf.

39. P. Resnick. 2001. Internet Message Format(RFC5321).

https://www.ietf.org/rfc/rfc2822.txt.

40. Mark Risher. 2017. Protecting You Against Phishing. Google Security Blog. (2017).https://security.googleblog.com/2017/05/ protecting-you-against-phishing.html.

41. Everett M. Rogers. 2003.Diffusion of Innovations, 5th Edition(5th ed.). 42. Viswanath Venkatesh and Hillol Bala. 2008. Technology Acceptance

Model 3 and a Research Agenda on Interventions.Decision Sciences39, 2 (2008), 273–315.DOI:

http://dx.doi.org/10.1111/j.1540-5915.2008.00192.x

43. Viswanath Venkatesh and Fred D. Davis. 2000. A Theoretical Extension of the Technology Acceptance Model: Four Longitudinal Field Studies.

Management Science46, 2 (2000), 186–204.DOI:

http://dx.doi.org/10.1287/mnsc.46.2.186.11926