Issue with DNS on GCP using FortiGate HA example

Question

Closed this issue a year ago · 10 comments

Using the Terraform HA example, I'm facing an issue with the instances attached to my private subnet. When I ping a domain such as gmail.com or google.com from one of the instances routed through FortiGate, the destination resolves to a different public IP on the test instances than on the FortiGate instances (from FortiGate, the ping works). This looks like a DNS issue. Since the instances use the internal metadata server (169.254.169.254) for DNS, what is the best practice for DNS with FortiGate on GCP, and how should I configure it on FortiGate?

One important detail: I've also deployed FortiGate from the GCP Marketplace, and there is one difference that is probably the fix for this issue. In that deployment, under Network/DNS, I can see "Dynamically Obtained DNS Servers" with the interface "Port1" and the DNS server "169.254.169.254". The Terraform HA deployment doesn't have this entry. Maybe it's a problem with the SDN connector or a permission, but I couldn't find anything related in the Fortinet documentation.

Answer 1 · 2023-06-21T16:23:32.000Z

Hi CledersonE,

Regarding the DNS server 169.254.169.254: if you need to use Google's DNS server (169.254.169.254), you can set it in your DNS settings:

config system dns
set primary 169.254.169.254
set server-select-method failover
end

Since you are using HA, the interfaces usually use static addressing instead of DHCP. In that case, if you need to use the Google DNS server, you can configure it after the deployment, because by default FortiGate uses Fortinet's DNS servers.

Hope that helps.

Answer 2 · 2023-06-21T17:10:49.000Z

Hello @mobilesuitzero

Before opening the issue, I had already tried setting the Google DNS server, and after that I could no longer reach the internet (even from FortiGate). I tried it from the FortiGate CLI as well and hit the same problem: after applying the change, I lose internet connectivity, for example when running execute ping google.com from the CLI.

Answer 3 · 2023-06-22T16:27:50.000Z

Hi @CledersonE

Can you double-check that you have the proper route and that the firewall on the GCP side allows traffic to 169.254.169.254?

In my setup, I was able to ping it and also see the DNS traffic to it.

active # get router info routing-table details
Codes: K - kernel, C - connected, S - static, R - RIP, B - BGP
O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, ia - IS-IS inter area
V - BGP VPNv4
* - candidate default

Routing table for VRF=0
S* 0.0.0.0/0 [5/0] via 172.16.0.1, port1, [1/0]
S 10.128.0.0/16 [10/0] via 172.16.1.1, port2, [1/0]
S 172.16.0.0/24 [5/0] via 172.16.0.1, port1, [1/0]
S 172.16.0.1/32 [5/0] is directly connected, port1, [1/0]
C 172.16.0.2/32 is directly connected, port1
S 172.16.1.0/24 [10/0] via 172.16.1.1, port2, [1/0]
S 172.16.1.1/32 [10/0] is directly connected, port2, [1/0]
C 172.16.1.2/32 is directly connected, port2

active # exec ping 169.254.169.254
PING 169.254.169.254 (169.254.169.254): 56 data bytes
64 bytes from 169.254.169.254: icmp_seq=0 ttl=255 time=1.0 ms
64 bytes from 169.254.169.254: icmp_seq=1 ttl=255 time=1.4 ms
64 bytes from 169.254.169.254: icmp_seq=2 ttl=255 time=1.2 ms
64 bytes from 169.254.169.254: icmp_seq=3 ttl=255 time=1.3 ms
64 bytes from 169.254.169.254: icmp_seq=4 ttl=255 time=1.1 ms

active # exec ping www.fortinet.com
PING lb-2.us-east-2.aws.waas-online.net (18.216.71.25): 56 data bytes
64 bytes from 18.216.71.25: icmp_seq=0 ttl=59 time=28.3 ms
64 bytes from 18.216.71.25: icmp_seq=1 ttl=59 time=27.6 ms
64 bytes from 18.216.71.25: icmp_seq=2 ttl=59 time=27.5 ms
64 bytes from 18.216.71.25: icmp_seq=3 ttl=59 time=27.6 ms
64 bytes from 18.216.71.25: icmp_seq=4 ttl=59 time=27.6 ms

Cheers

Answer 4 · 2023-06-30T12:59:09.000Z

@mobilesuitzero Hello, sorry for the delay, but I only had time to test this yesterday.

Here is my routing table:

active # get router info routing-table details
Codes: K - kernel, C - connected, S - static, R - RIP, B - BGP
       O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, ia - IS-IS inter area
       V - BGP VPNv4
       * - candidate default

Routing table for VRF=0
S*      0.0.0.0/0 [10/0] via 172.16.0.1, port1, [1/0]
C       172.16.0.0/24 is directly connected, port1
S       172.16.1.0/24 [10/0] via 172.16.1.1, port2, [1/0]
C       172.16.1.2/32 is directly connected, port2
S       172.16.50.0/24 [10/0] via 172.16.1.1, port2, [1/0]


active #

I've set the DNS using the commands you mentioned:

active # config system dns
set primary 169.254.169.254
set server-select-method failover
end

But with that, the secondary DNS server stays set to the one from FortiGuard, and so far I can ping google or any other address:

active # show system dns
config system dns
    set primary 169.254.169.254
    set secondary 96.45.46.46
    set protocol dot
    set server-hostname "globalsdns.fortinet.net"
    set server-select-method failover
end

exec ping google.com
PING google.com (142.250.72.110): 56 data bytes
64 bytes from 142.250.72.110: icmp_seq=0 ttl=117 time=26.2 ms
64 bytes from 142.250.72.110: icmp_seq=1 ttl=117 time=25.7 ms
64 bytes from 142.250.72.110: icmp_seq=2 ttl=117 time=25.8 ms

But I'm still facing the problem from another instance that has an outbound firewall policy applied. When I ping google from it, the name resolves to a different IP address, and it doesn't receive a reply:

[lab@app-instance-0 ~]$ ping google.com
PING google.com (74.125.70.139) 56(84) bytes of data.
^C
--- google.com ping statistics ---
44 packets transmitted, 0 received, 100% packet loss, time 42999ms

This seems to be a DNS issue, since the FortiGate is using a different DNS server than the lab instance.

On Fortigate, when I remove the second DNS server from the configs and only keep the GCP DNS, then I lose communication with the internet:

active # show system dns
config system dns
    set primary 169.254.169.254
    set protocol dot
    set server-hostname "globalsdns.fortinet.net"
    set server-select-method failover
end

active # exec ping 169.254.169.254
PING 169.254.169.254 (169.254.169.254): 56 data bytes
64 bytes from 169.254.169.254: icmp_seq=0 ttl=255 time=0.4 ms
64 bytes from 169.254.169.254: icmp_seq=1 ttl=255 time=0.7 ms
64 bytes from 169.254.169.254: icmp_seq=2 ttl=255 time=0.8 ms
^C
--- 169.254.169.254 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.4/0.6/0.8 ms

active # exec ping google.com
Unable to resolve hostname.

In your setup, can you remove the second DNS server to make sure that it will only use the GCP one 169.254.169.254 and try to reach the internet?

Answer 5 · 2023-06-30T16:36:23.000Z

Hi @CledersonE ,

In your configuration, the DNS protocol is set to dot. You might need to use cleartext in order to send DNS queries to 169.254.169.254:
config system dns
set protocol cleartext
end
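
As background on the dot versus cleartext setting: DNS over TLS (dot) uses TCP port 853, while plain DNS uses port 53. The thread does not confirm why the metadata server failed with dot, but the behavior is consistent with 169.254.169.254 answering only plain DNS. Below is a minimal Python sketch, not FortiGate code, of what a cleartext A-record query looks like on the wire. The names DNS_PORTS, build_dns_query, and query_udp are made up for this illustration.

```python
import socket
import struct

# Well-known ports: plain DNS uses 53, DNS over TLS (RFC 7858) uses 853.
# The keys mirror the FortiOS "set protocol" values seen in this thread.
DNS_PORTS = {"cleartext": 53, "dot": 853}


def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal plain-text DNS query for an A record."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE = A (1), QCLASS = IN (1).
    return header + qname + struct.pack("!HH", 1, 1)


def query_udp(server: str, hostname: str, timeout: float = 2.0) -> bytes:
    """Send the query over UDP port 53 and return the raw response bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_dns_query(hostname), (server, DNS_PORTS["cleartext"]))
        data, _ = sock.recvfrom(512)
        return data
```

Running query_udp("169.254.169.254", "google.com") from a GCP instance would exercise the same cleartext path that FortiGate uses once the protocol is switched. It needs network access to the metadata server, so it only works from inside the VPC.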

Hope that helps.

Cheers

Answer 6 · 2023-06-30T19:35:39.000Z

Hi @mobilesuitzero

Interesting, since I didn't configure that manually (maybe it is in the template?). After changing this setting, I can reach the internet from the FortiGate, but I'm still facing the issue from the instance inside my private network. If I return to the default FortiGate DNS servers, my private instances can reach the internet again.
The resolv.conf on the instances also points to 169.254.169.254.

In your setup, can you try to deploy an instance that will use a Firewall policy and reach google, gmail or any other url that has multiple ip addresses?

Answer 7 · 2023-07-04T22:03:59.000Z

Hi @CledersonE ,

By default it uses dot. If you need cleartext, you have to configure that after deployment.

In my setup, when I ping google.ca from an instance that is behind the FortiGate, I can see the traffic traverse the FGT and I can see the reply.

fgtvm # diag sniffer packet any 'icmp' 4
Using Original Sniffing Mode
interfaces=[any]
filters=[icmp]
2.064172 port2 in 172.16.1.5 -> 142.250.1.94: icmp: echo request
2.064226 port1 out 172.16.0.2 -> 142.250.1.94: icmp: echo request
2.065777 port1 in 142.250.1.94 -> 172.16.0.2: icmp: echo reply
2.065791 port2 out 142.250.1.94 -> 172.16.1.5: icmp: echo reply

Cheers

Answer 8 · 2023-07-05T15:43:51.000Z

Hello @mobilesuitzero

To make sure that I'm using the same lab as you, I destroyed everything and created it from scratch using the HA template in an empty GCP project. The problem persists, but it only starts after I change my firewall policy. Here is what I did step by step, so you can understand it better and try to reproduce the issue:

Deployed Fortigate using the HA template;
Created a test instance using the default GCP OS configs (Debian 11) in the same VPC where I have the private subnets;
Created a firewall policy on GCP just to be able to SSH in the test instance from the internet;
Changed the DNS config to only use the 169.254.169.254:

config system dns
set primary 169.254.169.254
set secondary 0.0.0.0
set server-select-method failover
set protocol cleartext
end

Created an outbound firewall policy granting access to everything on the internet:

active # show firewall policy 
config firewall policy
    edit 1
        set name "Outbound-Test"
        set uuid 4e40619e-1b48-51ee-f6da-94585b3a2d52
        set srcintf "port2"
        set dstintf "port1"
        set action accept
        set srcaddr "all"
        set dstaddr "all"
        set schedule "always"
        set service "ALL"
        set logtraffic-start enable
        set nat enable
    next
end

At this point, I can reach the internet from the test instance behind the FortiGate (ping gmail.com):

active # diag sniffer packet any 'icmp' 4
Using Original Sniffing Mode
interfaces=[any]
filters=[icmp]
5.466030 port2 in 172.16.1.4 -> 142.251.171.102: icmp: echo request
5.466135 port1 out 172.16.0.2 -> 142.251.171.102: icmp: echo request
5.468739 port1 in 142.251.171.102 -> 172.16.0.2: icmp: echo reply
5.468788 port2 out 142.251.171.102 -> 172.16.1.4: icmp: echo reply
6.467676 port2 in 172.16.1.4 -> 142.251.171.102: icmp: echo request
6.467708 port1 out 172.16.0.2 -> 142.251.171.102: icmp: echo request
6.468821 port1 in 142.251.171.102 -> 172.16.0.2: icmp: echo reply
6.468847 port2 out 142.251.171.102 -> 172.16.1.4: icmp: echo reply
7.469352 port2 in 172.16.1.4 -> 142.251.171.102: icmp: echo request
7.469375 port1 out 172.16.0.2 -> 142.251.171.102: icmp: echo request

Changed the outbound firewall to only accept access to gmail.com:

config firewall policy
    edit 1
        set name "Outbound-Test"
        set uuid 4e40619e-1b48-51ee-f6da-94585b3a2d52
        set srcintf "port2"
        set dstintf "port1"
        set action accept
        set srcaddr "all"
        set dstaddr "gmail.com"
        set schedule "always"
        set service "ALL"
        set logtraffic-start enable
        set nat enable
    next
end

The problem starts from the test instance behind the FortiGate (ping gmail.com):

active # diag sniffer packet any 'icmp' 4
Using Original Sniffing Mode
interfaces=[any]
filters=[icmp]
5.150055 port2 in 172.16.1.4 -> 173.194.194.17: icmp: echo request
6.169216 port2 in 172.16.1.4 -> 173.194.194.17: icmp: echo request
7.193139 port2 in 172.16.1.4 -> 173.194.194.17: icmp: echo request
8.217146 port2 in 172.16.1.4 -> 173.194.194.17: icmp: echo request
9.241023 port2 in 172.16.1.4 -> 173.194.194.17: icmp: echo request
10.264981 port2 in 172.16.1.4 -> 173.194.194.17: icmp: echo request
11.288895 port2 in 172.16.1.4 -> 173.194.194.17: icmp: echo request
12.312838 port2 in 172.16.1.4 -> 173.194.194.17: icmp: echo request
^C
8 packets received by filter
0 packets dropped by kernel
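
The difference between the two sniffer traces can be checked mechanically: in the working case each echo request appears on port2 in and again on port1 out, while in the failing case it only ever appears on port2 in. Here is a small Python sketch that parses sniffer lines in the format shown in this thread. The regular expression and the helper names summarize and requests_forwarded are illustrative assumptions, not part of any Fortinet tooling.

```python
import re
from collections import Counter

# One line of `diag sniffer packet any 'icmp' 4` output, as pasted in this thread.
SNIFFER_LINE = re.compile(
    r"^\s*\d+\.\d+\s+(?P<intf>\S+)\s+(?P<direction>in|out)\s+"
    r"\S+\s+->\s+\S+:\s+icmp:\s+echo\s+(?P<kind>request|reply)\s*$"
)


def summarize(trace: str) -> Counter:
    """Count packets per (interface, direction, request/reply)."""
    counts = Counter()
    for line in trace.splitlines():
        match = SNIFFER_LINE.match(line)
        if match:  # skip banner lines such as "interfaces=[any]"
            counts[(match["intf"], match["direction"], match["kind"])] += 1
    return counts


def requests_forwarded(trace: str, inside: str = "port2", outside: str = "port1") -> bool:
    """True if echo requests arrive on the inside port and also leave the outside port."""
    counts = summarize(trace)
    return counts[(inside, "in", "request")] > 0 and counts[(outside, "out", "request")] > 0


# Excerpts copied from the two traces in this post.
WORKING_TRACE = """\
5.466030 port2 in 172.16.1.4 -> 142.251.171.102: icmp: echo request
5.466135 port1 out 172.16.0.2 -> 142.251.171.102: icmp: echo request
5.468739 port1 in 142.251.171.102 -> 172.16.0.2: icmp: echo reply
5.468788 port2 out 142.251.171.102 -> 172.16.1.4: icmp: echo reply
"""

BLOCKED_TRACE = """\
5.150055 port2 in 172.16.1.4 -> 173.194.194.17: icmp: echo request
6.169216 port2 in 172.16.1.4 -> 173.194.194.17: icmp: echo request
"""

print(requests_forwarded(WORKING_TRACE))
print(requests_forwarded(BLOCKED_TRACE))
```

A request that is seen on port2 in but never on port1 out means the FortiGate itself is not forwarding the packet, which points at policy matching rather than at routing or the upstream network.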

If I change the firewall policy back to grant access to all, the test instance receives the replies correctly again.

Can you try to reproduce that and see if it fails? I still think it is something about DNS between the FortiGate instances and the instances behind FortiGate, which seem to be using different DNS servers.

Answer 9 · 2023-07-05T18:18:17.000Z

Hi @CledersonE

Looks like you are talking about FQDN address resolution in FOS itself.

Since that is outside the scope of the Terraform script and is more on the FOS side, it might be best to engage with support on that.

However, you can take a look at the following URLs to troubleshoot it:
https://community.fortinet.com/t5/FortiGate/Technical-Tip-FQDN-based-firewall-policies-are-not-working/ta-p/196844
https://community.fortinet.com/t5/FortiGate/Troubleshooting-Tip-How-to-verify-the-FDQN-IP-address-in-DNS/ta-p/197321
https://community.fortinet.com/t5/FortiGate/Technical-Tip-Explanation-of-the-FQDN-nbsp-default-nbsp-cache/ta-p/213280
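
To illustrate the kind of mismatch those articles deal with: as far as I understand it, an FQDN address object is matched against the IP addresses the FortiGate itself resolved for that name, not against the name the client looked up. If the client gets a different answer for gmail.com, its traffic will not match the policy. A minimal Python sketch of that logic follows. The helper name fqdn_policy_matches is made up, and the firewall-side address is a HYPOTHETICAL placeholder, because the thread never shows what the FortiGate actually resolved.

```python
from ipaddress import ip_address
from typing import Iterable


def fqdn_policy_matches(dst_ip: str, firewall_resolved: Iterable[str]) -> bool:
    """Return True if dst_ip is one of the addresses the firewall resolved for the FQDN."""
    allowed = {ip_address(addr) for addr in firewall_resolved}
    return ip_address(dst_ip) in allowed


# HYPOTHETICAL: what the FortiGate might hold for "gmail.com". This value is
# borrowed from an earlier trace in the thread purely for illustration.
firewall_resolved = ["142.251.171.102"]

# Destination the test instance actually pinged in the failing trace.
client_destination = "173.194.194.17"

print(fqdn_policy_matches(client_destination, firewall_resolved))
```

Comparing the addresses the FortiGate holds for the FQDN with the address the client is pinging is the first thing to check. If they differ, the echo requests will never leave port1, which is consistent with the failing sniffer trace.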

Hope that helps.

Cheers

Answer 10 · 2023-07-05T20:21:15.000Z

Hi @mobilesuitzero
Thanks for the support so far. I will take a look at the links you sent, and if I still have problems I will open a support ticket. In addition, once the problem is fixed, I'll post the solution here in case anybody else runs into it.